Data collection

Data collection is the process by which companies gather information about individuals, often without their full knowledge or explicit consent. This information can range from basic demographics to browsing history or even biometric data.

"Data is the currency of the future."

One of the core principles that we hold dear at Gaboule is privacy. Individuals should have control over their personal information and data, without fearing that it will be misused or shared without consent.

But today, protecting this right is very challenging. Almost all proprietary software collects data, and it is hard to know exactly what that software is doing in the background. The data collected can be used for analytics or targeted advertising, but also for mass surveillance or other purposes.

Examples

Here are a few examples of data processing that, in our opinion, is immoral.

A small note

Because we are a small community, public opinion tends to treat us as a "bad source".

We therefore ask you not to rely on our claims and to think for yourself! But also understand that big companies are in business to make money, not to tell you the truth.

Companies repeat the same corporate lies ("We care about your privacy", "We are committed to green energy") and we simply want to give you information and a different perspective.

Almost nothing that Big Tech (or smaller companies) says about encryption, privacy policies, data storage, and so on can be independently verified, so don't take their word for it.

Microsoft Windows

This is the most widely used desktop operating system, and it is known for being the worst in terms of privacy.

Telemetry

Microsoft collects an enormous amount of user data through various means, including:

  • Crash reports: When your system crashes or experiences errors.
  • Product usage metrics: Microsoft tracks how you use their products, such as which features you use most often and how long you spend on specific tasks.
  • Telemetry data: Details about your hardware, software, and usage patterns.

Some of the data that Microsoft collects includes:

  • Device information (e.g. CPU type, RAM, disk space)
  • Software installation and usage history
  • Browser and app usage patterns
  • Search queries and online activities

Some of these items are exactly what almost no one wants their device to send automatically to online servers, especially when this happens without their knowledge or explicit consent. Yet all of it is already happening behind the scenes, while users remain blissfully unaware of the trail of data they leave behind.

Advertisements

Microsoft has also added advertisements to parts of the Windows user interface, for example the Start menu.[1]

Can you escape?

Microsoft likes to make users feel in control by offering a few options to turn off telemetry and other tracking features, but these settings are often buried deep within the system, making them difficult for average users to find. Even then, many tracking features remain enabled, because Windows ($139) apparently has an inherent right to spy on its users - a notion that would be met with outrage if it came from a Chinese app or any other entity "foreign" to the US.

People have made tools - like AME - to help users be spied on less by Microsoft, but using them is usually a burden.

So does Microsoft collect data?

Yes. Microsoft admits to it themselves,[2] and traffic analysis has shown that Windows 10 continues to send data even after you disable analytics.[3]

So, we know that Microsoft:

And we think that Microsoft most likely:

  • Primarily shares user data with advertisers to capitalize on their users.
  • Continues sharing data with the US government.

Apple iPhone

Hot topic!

It is very rare to find people who actually criticize Apple's privacy practices, as everyone seems to love their iPhones and think they are perfect. In fact, many groups that share similar values with us think Apple's marketing has been incredibly effective at creating an illusion of privacy, while potentially collecting significant amounts of user data.

No one wants to rethink their buying decision, but we actually think it is necessary to have a neutral point of view. If you think we are wrong, please share your thoughts with us or read more here.

Small note!

This section can also apply to other Apple products or systems, as they are all very similar.

No other product is as iconic as the Apple iPhone, and its influence on modern smartphones cannot be overstated, mainly because it's been the same design for over a decade and the only "innovation" has been charging more money for slightly better specs. Although cultural branding seems to be very important to users, they care even more about their privacy - the iPhone's main selling point.

An iPhone typically stores these types of user data:

  • Messages (including attachments)
  • Photos, videos, and other files stored on the device
  • Email
  • Contacts
  • Location data
  • Browsing history and bookmarks
  • Calendar events

Surrounded by ubiquitous advertisements touting "privacy" and "security" (two terms that are often incorrectly used interchangeably), customers choose a device with an operating system that is opaque about how user data is collected and stored. They trust Apple with access to their personal information, surrendering control over their own data in exchange for a "sleek design" and a "seamless user experience".

When talking about the iPhone's data collection practices, we need to acknowledge that many users are unaware of how their personal data is being used. This lack of transparency can lead to an unconscious acceptance of invasive features and policies, without users ever witnessing what is actually happening in the background.

Apple now relies on its brand image to sell products, as demonstrated by their very uninspired advertisements. Time and time again, the company issues statements touting its commitment to consumer data protection, while it could just as well be quietly collecting and monetizing that very same data behind closed doors. And yet, mostly because they are unaware of these issues, customers continue to flock to it in droves. This perpetuates a cycle of convenience over control. It's a curious thing - but one that speaks volumes about the lengths to which some end users go for the sake of a "smooth user experience".

Apple sometimes publishes videos like these. This one basically says that server software is impossible to trust (which is true), but that if you use an iPhone, you are in control. How? Because "the software image for your iPhone is accessible to independent experts", who can verify the software for you.

This already makes zero sense and it's really scary how they get away with this.

Firstly, third-party auditors have only limited access to this audit process, and the company tries to avoid being audited by "real" independent experts. Are they trying to hide something? Or are they just protecting their "intellectual property" (which no one cares about in the context of security)?

And secondly, when the data gets to the server from an iPhone, no one cares about its software image! The data already got to the server.

And in contrast to regular servers, their server infrastructure, named "Private Cloud Compute", is said to:

  • Run on "Apple silicon"
  • Use the "Swift programming language"
  • Run software "with transparency built in".

All of this literally does not provide any inherent security. This seems so obvious. Using a different CPU architecture and running their opaque proprietary software doesn't magically make their servers secure! There's no difference from other servers.

CPU architecture?

What Apple is doing here is saying that their servers are more private (which is not the same as more secure) because they supposedly run on Apple's proprietary chips.

This is like saying that using an Intel CPU is more "private" than using an AMD CPU. Privacy does not come from a neutral component like a CPU, which just executes instructions.

What influences privacy is those very same instructions, a.k.a. SOFTWARE. And guess what? It's all proprietary at Apple!

So when is it safe to trust a server?

Since Apple fails to explain it, we will tell you one of the rare cases where you can "blindly trust" the server (all of these conditions must be met):

  • The data you're putting onto the server has to be encrypted:
    • not just in transit, but at rest
    • by yourself, with your own public key (asymmetric encryption)
    • with a client you trust (ideally an auditable/audited one, unlike that iPhone software image you will never have access to)

Then you're all good, although the server can still keep metadata about which action you performed and when (upload, download, and so on).
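The conditions above can be sketched in a few lines. This is only a minimal illustration, not a recommended implementation: `UntrustedServer`, `encrypt`, and `decrypt` are hypothetical names, and because the Python standard library ships no asymmetric primitives, a one-time pad stands in for the public-key encryption an audited client would actually perform. The point is what the server ends up holding: an opaque blob, plus metadata.

```python
import secrets
import time
from dataclasses import dataclass, field


def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Stand-in for client-side encryption (one-time pad, NOT asymmetric).

    Assumption: a real client would use an audited public-key or hybrid
    scheme; the pad only keeps this sketch free of third-party packages.
    """
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Reverse the pad; only the holder of the key can do this."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


@dataclass
class UntrustedServer:
    """Hypothetical server: stores opaque blobs and logs metadata."""
    blobs: dict[str, bytes] = field(default_factory=dict)
    log: list[tuple[float, str, str, int]] = field(default_factory=list)

    def upload(self, name: str, blob: bytes) -> None:
        self.blobs[name] = blob
        # The content is unreadable, but time, action, name and size are not.
        self.log.append((time.time(), "upload", name, len(blob)))

    def download(self, name: str) -> bytes:
        blob = self.blobs[name]
        self.log.append((time.time(), "download", name, len(blob)))
        return blob


server = UntrustedServer()
note = b"meet at the usual place at 9"

key, blob = encrypt(note)        # runs on the client; the key never leaves it
server.upload("note-1", blob)    # the server only ever receives ciphertext
restored = decrypt(key, server.download("note-1"))
```

Even in this best case, `server.log` records when each upload and download happened, under which name, and how large the blob was.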

Of course, Apple using its servers as a GPU cloud will never be private, because a server that computes on your data has to read it unencrypted. What they're describing is pretty much impossible.

And if you thought that was impressive, they're now eager to extend "the privacy of the iPhone into the cloud to unlock even more intelligence for you." Now that's one way to be told how smart we are.

Unless the tiny minority of consumers who actually care about their digital lives decides to exercise some semblance of control over their own data, Apple will continue to give pointless arguments and deflect any criticism by pretending that their "independent experts" are somehow safeguarding your digital well-being. It could be that they're simply whitewashing the issue to maintain the illusion of transparency, while quietly profiting from your personal data. Perhaps by licensing the latter to third-party partners under contractual arrangements rather than selling it outright.

This would undermine their claim that no third party is involved, as the data can still be shared and monetized without direct sales. Apple can effectively bend the law and do whatever it wishes, as one Apple lawyer put it:

"Given Apple's extensive privacy disclosures, no reasonable user would expect that their actions in Apple's apps would be private from Apple."[4]

Here is a hypothetical scenario that shows how something of this nature could be orchestrated:

Why do this?

Speculation is usually given a negative connotation. The following content can be seen as a form of risk analysis, where we consider the potential consequences of certain actions or events. This is not inherently negative; in fact, it's essential for insurers, who must anticipate and prepare for possible risks.

We're not pretending to know what Apple does with their devices, but rather exploring hypothetical scenarios based on the technical capabilities of their products. This helps us draw a contrast with FOSS, which prioritizes auditability and is less susceptible to bundling or control by a single entity. Our analysis qualifies as "reasonable suspicion".

Apple has been part of the NSA's PRISM program since 2012,[5] which suggests to us that they very likely have automatic data retrieval processes integrated into the operating system of their devices and/or on the server side.

Their claim that everything is encrypted provides a strong legal shield against potential lawsuits, but it cannot be independently verified, so no one should take it on trust.

If data stored locally on the devices were uploaded somewhere, it would be visible to anyone analyzing network traffic. Such analysis has been done in the past, but we think it is a rather short-sighted approach when Apple has a massive mesh network (Find My), built on short-range radios such as Bluetooth Low Energy and ultra-wideband (UWB), that could easily carry data to their servers.

This approach would involve device-to-device communication over that mesh network. Devices would interact with one another and ultimately relay aggregated information to Apple via slow TCP/HTTPS/WSS transfers (or any other reliable protocol), allowing for infrequent but almost unnoticeable offloading of sensitive information from a line of devices that are almost all battery powered.

As a concrete example, a user's iOS device could be programmed to archive the data it holds on its user at night (e.g. while the phone is charging and not in use). Then, during the day, when the user walks around with this device in their pocket or bag and comes across other Apple devices, their device transfers small parts of this archive over the short-range radio link to those devices, which in turn upload them to Apple. Conversely, as the same user walks past other devices, their device receives small parts of other people's data to be uploaded.

In this mesh network, everyone would contribute to sending other people's data: each device would act as a temporary relay that briefly holds information from others and forwards it to Apple servers.

The scheme would also work if Apple were to deceptively claim that Airplane Mode disables these radios while actually leaving them turned on.

So does Apple collect data?

They most probably do (at least for telemetry), but they may also have an interest in collecting personal user data.[6]

So, we know that Apple:

We think that Apple most likely:

  • Continues to share data with the US government.
  • Has partnerships with other companies to share data and monetize their users (while passing the data through middlemen so that it cannot be traced directly back to Apple)
    • A good example of this is that Apple has hired subcontractors to listen to and transcribe Siri recordings, including recordings from users who didn't activate the Siri function on their Apple devices.
      • This work could be used to train an AI model that turns oral conversations (audio) into text, which is very sensitive data.

We think that Apple could:

  • Have a way to break the encryption of the data stored on their servers (by keeping the keys, or by simply NOT encrypting the data at rest)
  • Have their devices phone home by sending massive data dumps slowly, so that it is barely noticeable from an end user's perspective (by pinning this process to a dedicated CPU core, compressing data dumps only while the phone is charging, relaying data to Apple through the Find My mesh network instead of easily analyzable network traffic, etc.)

Fictional examples

These examples are intentionally fictional and are meant to illustrate data collection methods.

A note-taking app (fictional)

NotezX Ultimate is a smartphone app made by GiveUsMoney Inc. that allows its users to create up to 200 notes and sync them to the cloud for free. It uses Google SDKs to collect user data for "analytics" and "crash reports". GiveUsMoney Inc. also shares the users' notes directly with a few different advertising partners.

Most people would expect a note-taking app to just take notes, but about 70% of the app logic consists of "features" that track user interaction, send notes over the network, and more. Not only does this drain the users' battery faster, it is also very privacy invasive.

Alice's blog (fictional)

Alice runs an independent blog set up by a friend. She writes articles there and has a decent number of readers. She would like to monetize her website, and she quickly finds out that Google Ads will give her a solid amount of money in exchange for a few ads on her website.

This example shows the dynamic between small websites and Google. Due to the latter's dominant position in search engines and online advertising, many website owners feel pressured into partnering with Google's AdSense program. This leaves them few alternatives for monetizing their content, effectively holding them hostage and limiting their freedom of choice.

The JavaScript code that website owners inject into every page allows Google (and other third-party trackers) to monitor user interactions on these websites, often in ways that are opaque or even hostile to users' interests.
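The mechanism can be modeled in a few lines. This is a simplified sketch of generic third-party tracking, not a description of how any particular vendor's script works: `ThirdPartyTracker`, `visit`, the cookie value, and the `.example` URLs are all made up for illustration. It shows why one script embedded on many unrelated sites is enough to build a cross-site browsing profile.

```python
from collections import defaultdict


class ThirdPartyTracker:
    """Hypothetical tracker whose script is embedded on many websites."""

    def __init__(self) -> None:
        # Maps the tracker's own cookie ID to every page that reported it.
        self.profiles: dict[str, list[str]] = defaultdict(list)

    def on_page_view(self, cookie_id: str, page_url: str) -> None:
        self.profiles[cookie_id].append(page_url)


# Which (made-up) pages embed the tracker's script.
EMBEDS_TRACKER = {
    "https://alice-blog.example/post-1": True,
    "https://health-forum.example/thread-42": True,
    "https://no-ads.example/home": False,
}


def visit(tracker: ThirdPartyTracker, cookie_id: str, url: str) -> None:
    """Simulate a page load; the tracker is notified only if the page embeds it."""
    if EMBEDS_TRACKER[url]:
        tracker.on_page_view(cookie_id, url)


tracker = ThirdPartyTracker()
for url in EMBEDS_TRACKER:
    visit(tracker, "cookie-123", url)  # the same browser visits all three pages

profile = tracker.profiles["cookie-123"]
```

Neither Alice nor the forum owner ever sees the other's visitors, yet the tracker ends up with both visits linked to the same browser. The page without the script contributes nothing to the profile.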

How to fix it?

Because its source code is open to inspection, FOSS can fix data collection problems at the root: anyone can check what the software does and remove what should not be there. It is usually free of the intrusive tracking and surveillance that are unfortunately all too common nowadays.

Further reading

Disagree with us?

If you find yourself disagreeing with anything we wrote, please hit us up on Matrix. We appreciate open discussions about complex topics like this one.

Footnotes

  1. "New! The Recommended section of the Start menu will show some Microsoft Store apps": https://web.archive.org/web/20250723161200/https://support.microsoft.com/en-us/topic/april-23-2024-kb5036980-os-builds-22621-3527-and-22631-3527-preview-5a0d6c49-e42e-4eb4-8541-33a7139281ed

  2. https://www.microsoft.com/en-us/privacy/privacystatement

  3. https://thehackernews.com/2016/02/microsoft-windows10-privacy.html

  4. https://web.archive.org/web/20250103200825/https://storage.courtlistener.com/recap/gov.uscourts.cand.403685/gov.uscourts.cand.403685.122.0.pdf

  5. Of course, Apple would have never heard of such a thing... https://www.theguardian.com/world/2013/jun/06/us-tech-giants-nsa-data

  6. As Proton suggests in this article, Apple could be searching for new revenue sources. Big Tech often relies on advertising to make revenue, so this is a totally plausible idea.