Frequently Asked Questions
Why is iCloud Drive unsupported?
Because, apparently, there's no official way for an app to access the whole content of an iCloud Drive; it can only reach a single app-specific container that holds its own data. If you're reading this and know better, please contact me. I'm all ears!
Just a heads-up for expert users: there's an unofficial npm library that could access the whole drive. If you know Node.js/JavaScript, you could try using that.
Unfortunately, I can't implement that method in DeDuplicate because:
- First of all, it would be unreliable and unstable.
- Moreover, on iOS, I would run the risk of violating the App Store's "terms and conditions" or something, and they could easily remove the app or, even worse, close my developer account.
- In the description of the library, the author even suggests the possibility that your account could be put at risk!
Why is Google Photos unsupported?
Because the Google Photos API is so limited that it's impossible even to support it partially. These are the deal breakers:
- Requests about media items don't receive hash values, which are vital for an app like DeDuplicate to detect duplicates.
- An app is not allowed to delete items that weren't created by the app itself, or even move them (for example, to a new "Duplicates" album). This limitation obviously prevents an app from actually DOING something — even if the hash values were available — because it would only be able to show a list of duplicate items... not very useful.
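To illustrate why missing hash values are a deal breaker, here is a minimal Python sketch of duplicate detection that works from metadata alone. The record layout and the `hash` field name are hypothetical, chosen only for this example; they are not the format of any real API, and this is not DeDuplicate's actual code.

```python
from collections import defaultdict

def group_by_hash(items):
    """Group metadata records by hash value.

    Records that carry no hash cannot be compared without downloading
    the file, so they are returned separately.
    """
    groups = defaultdict(list)
    unhashed = []
    for item in items:
        digest = item.get("hash")  # hypothetical field name
        if digest is None:
            unhashed.append(item["name"])
        else:
            groups[digest].append(item["name"])
    # Only hashes shared by two or more files indicate duplicates.
    duplicates = {h: names for h, names in groups.items() if len(names) > 1}
    return duplicates, unhashed

items = [
    {"name": "cat.jpg", "hash": "aaa111"},
    {"name": "cat (1).jpg", "hash": "aaa111"},
    {"name": "dog.jpg", "hash": "bbb222"},
    {"name": "no-hash-available.jpg"},  # what a hashless service gives you
]
duplicates, unhashed = group_by_hash(items)
print(duplicates)
print(unhashed)
```

When every record looks like the last one, nothing can be grouped without downloading each file first.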
Unfortunately, judging by the fact that people on the Internet have been asking for those features for years and Google never implemented them, we can be pretty sure that those limitations are by design and there's nothing we can do about it.
So, ok, Google Photos itself is very limited; but if you really need to delete duplicates and such, you can always take advantage of Google Takeout, the official tool offered by Google for downloading your data from their services.
A user pointed me to a very clever workaround based on it that lets you transfer everything to Google Drive (supported by DeDuplicate) without downloading and re-uploading anything. You can follow it step by step or just take inspiration from it.
- Open Google Takeout and export all your Google Photos media, setting Google Drive as the destination (other cloud services are supported too, but only Google Drive lets you do the next step);
- When the export is done, go to the resulting ZIP archive. You can extract it directly to your drive using ZIP Extractor;
- Now that all your media is on GDrive, you can use DeDuplicate as usual!
This method has its obvious downsides, unfortunately. First, I guess you need enough free space for the ZIP archive (actually twice its size, if only for a moment before you delete it, because the extracted files need space too). Second, all your stuff on Google Photos is still there, occupying your storage space; you've just made a copy of it on Google Drive and scanned that copy, and only that copy, for duplicates.
Furthermore, if you want to wipe out all your content from Google Photos, spoiler alert: get ready for some manual work.
Why is Amazon Drive unsupported?
Because it’s discontinued:
- Its mobile apps were taken down from the stores at the end of October 2022;
- Since January 31st, 2023, uploading files through the website has been disabled;
- It completely shut down on December 31st, 2023!
These conditions are far from ideal and, to be frank, it’s not worth it for me to support a service that has ceased to exist.
All things considered, users of Amazon Drive probably need a program that transfers all their data to a different cloud service, while they still can :) If you’re familiar with the command line, a tool called “rclone” lets you transfer your contents from service to service.
I have some duplicate photos but they don't count as duplicates, why?
Because they aren't, even if they look identical to the naked eye! Keep in mind that this app searches for exact duplicate files by comparing their hash values. Two files must have the very same content, byte for byte, to count as duplicates.
Two images can appear identical but, if somehow one of them has been re-encoded, they are different files even if by any chance they happen to have the same size. If you save a JPEG as a JPEG again, you don't get the same file; the same goes for MP3s, videos... (and anything that uses a lossy compression format).
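As a minimal illustration in Python: an exact copy yields the same hash, while a file of the same size with even slightly different bytes does not. SHA-256 is used here only as an example; which hash each cloud service actually reports is not specified here, and the byte strings are stand-ins for real image data.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Stand-ins for file contents (not real JPEG data).
original = b"\xff\xd8 pretend JPEG payload A"
exact_copy = bytes(original)
reencoded = b"\xff\xd8 pretend JPEG payload B"  # same length, different bytes

same_size = len(original) == len(reencoded)
copy_matches = sha256_hex(original) == sha256_hex(exact_copy)
reencoded_matches = sha256_hex(original) == sha256_hex(reencoded)

print(same_size, copy_matches, reencoded_matches)
```

Matching sizes say nothing about the content; only the hash comparison tells the two cases apart.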
Of course, it's technically possible to detect similar pictures as well; but that's a completely different story. An app would need to download all the files and use some algorithm to do the comparison.
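For the curious, here is a toy Python sketch of one such algorithm, commonly called an "average hash". It is not something DeDuplicate does; it only shows why this approach needs the actual pixel data, and therefore the downloaded files. The tiny 2x2 grayscale grids are made-up sample data.

```python
def average_hash(pixels):
    """Return a list of bits: 1 where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming_distance(a, b):
    """Count the positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

# Made-up grayscale images (0 = black, 255 = white).
image = [[10, 200], [220, 30]]
reencoded = [[12, 198], [223, 29]]   # nearly the same picture, different bytes
different = [[200, 10], [30, 220]]   # a different picture

d_similar = hamming_distance(average_hash(image), average_hash(reencoded))
d_different = hamming_distance(average_hash(image), average_hash(different))
print(d_similar, d_different)
```

A small distance suggests similar pictures, but where to draw the line is a judgment call, unlike the yes-or-no answer of an exact hash match.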
The app is stuck or crashed, help!
According to statistics, crashes are very rare, but they can happen. As advertised, this app does all the heavy lifting on your device. This is what makes it particularly privacy-friendly, but it has its downsides and leaves a certain degree of unpredictability because:
- The information downloaded about your files is stored in memory and on your local storage so, theoretically, a combination of factors (low memory plus a huge number of scanned files) could make the app crash.
- If the network connection is unstable, the scanning process can fail, and not so gracefully: it gets stuck and you have to force quit the app.
Future versions will try to have a much better error-handling mechanism. This is just the first release!
How fast is this app, actually?
In the tests conducted during development, we scanned 100,000 files in less than 10 minutes (remember: it's the number of files, not their size, that affects the time required to complete the operation). Of course, it also depends on the network connection.
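As a rough back-of-the-envelope estimate in Python, the test figure works out to at least about 167 files per second. The extrapolation below assumes the time grows linearly with the number of files and that your connection is comparable to the test one; both are assumptions, not measurements.

```python
def files_per_second(files: int, seconds: float) -> float:
    """Scan rate implied by a measured run."""
    return files / seconds

def estimated_minutes(file_count: int, rate: float) -> float:
    """Estimated scan time in minutes, assuming linear scaling (an assumption)."""
    return file_count / rate / 60

# Test figure from above: 100,000 files in (less than) 10 minutes.
rate = files_per_second(100_000, 10 * 60)
print(round(rate))                               # files per second, rounded
print(round(estimated_minutes(250_000, rate)))   # minutes for a 250,000-file drive
```

Because the test took less than 10 minutes, the real rate was somewhat higher, so these numbers lean toward the pessimistic side.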
There is also another factor: thumbnails. After the main phase of the scanning process, the app tries to obtain a preview of each file that has one or more duplicates. Typically, only certain file types like images and videos can have a preview; but it also depends on the cloud service you're using.
Thumbnails are very small image files, just a few KB each, but they still take time to download if there are many of them!
Why this app?
Because I needed something like this myself, but I couldn't find a solution that ticked all these boxes:
- No limitations during the trial period
- No account registration needed
- Direct connection to the cloud providers, no intermediary servers
- No need to download the files and waste bandwidth
- Mobile version and cross-platform
Why should I care about removing duplicates? I have plenty of free space!
Me too, I have 1 TB of space and it took me years to fill it up halfway. But it still drives me a little crazy to know that a few thousand of those files are redundant and useless, and I want to get rid of them.
I made this app specifically to avoid the painstaking effort of doing it manually since it would have been next to impossible. Now I can just launch this app every now and then. Sometimes it finds something, sometimes it doesn't... who cares? It only takes a few minutes.
How can I achieve the same goal without an app like this?
When it comes down to it, the only way to remove duplicate files from a cloud drive turns out to be the time-consuming and unnerving procedure of downloading everything to your computer (defeating the purpose of cloud storage) and using one of the many existing utilities that delete duplicate files on the hard disk, in an “offline” way. When done, the changes must be synchronized to the cloud; in the worst-case scenario, you have to re-upload everything from scratch. That’s not a viable option when you have over 500 GB of unorganized cat pictures :) And of course, you can't do that on your phone.
A certain feature is missing, can you add it?
Well, I can try! It has already happened in the recent past. This project is obviously not my full-time job, so I can't guarantee anything, but I certainly intend to make improvements.
I don't trust you, change my mind!
Ironically, a big part of the whole reason I made this app was that I was suspicious about the other existing solutions.
Truth be told, I couldn't prove that it is 100% genuine even if I made its source code public: what guarantees that I will use that code in the published app? And, to be frank, I would still hesitate to make that choice because, although this project started as a personal utility, I have now been working on it for two years. I wrote on Medium about some interesting parts of the app; still, I'm not comfortable with the idea of sharing the entire code (even if it's not exactly rocket science).
So yes, to a certain degree you just have to trust this app because, when you log in to your cloud provider and authorize it to manipulate your files, you can't be entirely sure that I'm not some evil scammer. But on the other hand, consider the fact that this app is published on Apple's App Store and Google's Play Store, and they claim to have very strict standards. I'm NOT saying that this is conclusive proof because it absolutely isn't, but it's better than nothing.
Also, I'm pretty sure that Google Drive, Dropbox, OneDrive, etc. keep an eye on what's going on with their APIs and they would notice any suspicious usage.
Let's put it this way:
If you're a pro, you could monitor the outgoing network traffic and see for yourself that the only connections that take place are to the API of the cloud service you're logged into, plus some extra stuff like Google Firebase for basic analytics, Sentry for error reporting, and my own server (the same server this very website is hosted on) for OTA updates where applicable.
Please also note that:
- The app can see your email address (which is usually the user name) and the profile picture of your accounts, but they're used for display purposes only so that you know which account you're logged into. I don't collect the email addresses of the users.
- Firebase and Sentry, the only two third-party services, are configured to send only anonymous data that can't be used to track you.
Any future plans?
I intend to keep developing this app, especially if it gains some traction. There's a lot of room for improvement (especially UI and error handling) and I'd like to support even more cloud services from other parts of the world.
Compared to its very first version, a lot of progress has already been made! At first, it was for OneDrive only; now it supports all the major platforms, and even more can be taken into consideration.
Didn't this app used to be free?
Recently, I've been experimenting with different pricing models. Since rolling out the initial version, I've found that Android, iOS, and (more recently) Windows have their own quirks in terms of user acquisition and conversions.
My initial aim was to offer this app entirely for free and keep the development going with ad revenue. However, this turned out to be a bit unrealistic: the earnings from ads were just too low to entertain the idea of sustaining the app in the long run.
The revenues weren't just low, but also highly unpredictable: the so-called "eCPM" can fluctuate so much that one month I'm barely scraping by, and the next month I'm down to a third of that. It's a well-known issue and not entirely unexpected, but after two years, I was hoping for something more substantial. Plus, I had a rough ride with the ad network in the past and, long story short, I've realized they can cut ties with you anytime they want, without any clear reason. It's not wise to depend solely on such an unstable source of income.
I guess ads might work better for games or apps with a massive daily user base, not so much for this one.
I've come to the conclusion that DeDuplicate is more of a niche app that people discover when they actually need it, rather than a go-to for the casual user.
Sometimes the need for the app arises on the spur of the moment (that's what the trial versions are for); in other cases, it's something people want to use regularly (enter the subscription model); and at times they prefer to make a one-time payment and keep it forever (the one-shot paid model).
I'm currently assessing these different pricing models, tweaking them based on the time period and the user's location.
One thing I'm firm about is this: no matter the changes I make, I won't leave previous purchasers behind. For instance, if you bought the paid version of the app (or the "No Ads" in-app purchase), and next year the app switches to a subscription model, you'll still have unrestricted access to the app.
I've already implemented several measures to accommodate the needs of previous customers, but navigating through all the different combinations isn't easy. If something's wrong — for instance, the app still prompts you for a subscription even though you've paid for it, or it continues to show ads when it shouldn't — please reach out to me and I'll set things straight.