Wrong information in the article "Git Fetch vs Pull"

This article:

It shows up as the top search result on Google for me, including a snippet of text answering my query:
fetch vs pull git
The sentence highlighted by the link above and in the screenshot below is FALSE:

git fetch does download more than metadata:
git-scm.com/docs/git-fetch
I regard freecodecamp.org as a trusted source with a great mission.
I hope this is rectified ASAP to avoid confusing or misleading anyone else.


Fetch branches and/or tags (collectively, “refs”) from one or more other repositories, along with the objects necessary to complete their histories. Remote-tracking branches are updated (see the description of `<refspec>` below for ways to control this behavior).

The actual updates to the files in the repository are not pulled in a git fetch, only the git information about the changes.

Your response is correct although it does not make the claim above (highlighted in purple) true.

It makes it perfectly true. The updated files in the remote repository are not transferred by git fetch, only information about what changes are present on the remote repository.

Do you want to test it yourself or should I record a video demonstration?

I use git every day. It’s part of my job.

git fetch gets what changes are available. It doesn’t make new files in your repo or apply any changes.

But it does download the files to your computer. New files, if any, are actually downloaded to your local repo, not just metadata.

No local changes to your current working branch occur during git fetch. The metadata needed to apply any remote changes is downloaded, but it goes into the .git folder. The new files themselves are created with a pull or a merge command following the fetch.
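This point is easy to demonstrate. The sketch below (assuming git is installed; all repo and file names are made up) shows that after a fetch the working tree is unchanged, yet the new file's content is already sitting under .git, reachable through the remote-tracking ref:

```shell
#!/bin/sh
# Sketch: `git fetch` stores new objects under .git/ but leaves the working
# tree alone. Repo and file names are hypothetical.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# A small "remote" repository with one commit.
git init -q origin-repo
(cd origin-repo && git -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "initial")

# Clone it, then add a new file on the remote side.
git clone -q origin-repo local-repo
(cd origin-repo && echo "new content" > new-file.txt && git add new-file.txt \
    && git -c user.email=a@b -c user.name=a commit -q -m "add new-file")

cd local-repo
git fetch -q origin

# The working tree still has no new-file.txt ...
test ! -f new-file.txt && echo "working tree untouched"
# ... but its content is already in .git, reachable via the remote-tracking ref:
git show origin/HEAD:new-file.txt
```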

I think the highlighted part could be clearer, but the author is trying to create a simplified explanation. I still wouldn’t say this article is ‘FALSE’, just not as clearly worded as I would want official technical documentation for a codebase to be. Fortunately, this isn’t official documentation, so I am more forgiving of less formal language.

I have no idea who wrote the article, but it would probably be clearer to say that on a fetch the metadata is downloaded without being applied, while on a pull it is downloaded and used to apply the changes.

I do understand the need to simplify concepts for beginners. If scrutinized, every article geared towards beginners is technically imprecise in how it explains things. Most of that does not matter unless the information is false enough to mislead you into a wrong understanding of the concept being taught.

This article is such a case. I am taking the time to report it because it would have had a very significant impact on our company’s project if I had not checked other sources as well. Here is my case and why this is beyond “just a simplification”.

I am working on the autoupdate feature of our new Electron app. The built-in or otherwise supported updaters seem to download the complete app with all its files. This is not ideal for our use case, since the app needs to update itself on multiple remote computers that use limited and expensive data plans. Getting the changes via a git repo is vastly more data efficient. The main executable file is 139 MB. This is where how the article is worded, and the way you use the word metadata, are so far from what actually happens that it is well into ‘false’ and ‘misleading’ territory.

The article says that it doesn’t transfer the files, it just checks via the metadata whether there are any changes compared to the local version. After reading it, I thought it would just compare the latest commit hashes to see if there is a newer one on the remote repo. In fact, in my case it would check AND download all of that 139 MB executable to my machine, making it available to merge afterwards while OFFLINE, since it would already be on my machine. That makes the “doesn’t do any file transferring” claim false and misleading.
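The offline part is testable, too. In the sketch below (git assumed installed; repo and file names made up), the remote is deleted entirely after the fetch, and the merge still succeeds, because fetch had already transferred every object the merge needs:

```shell
#!/bin/sh
# Sketch: after `git fetch`, a merge needs no network access at all.
set -e
tmp=$(mktemp -d)
cd "$tmp"

git init -q upstream
(cd upstream && git -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "initial")
git clone -q upstream work
(cd upstream && echo "v2" > app.txt && git add app.txt \
    && git -c user.email=a@b -c user.name=a commit -q -m "add app.txt")

cd work
git fetch -q origin
new=$(git rev-parse FETCH_HEAD)

# Simulate going offline: delete both the remote config and the upstream repo.
git remote remove origin
rm -rf ../upstream

# The merge still succeeds -- fetch already brought over everything needed.
git -c user.email=a@b -c user.name=a merge -q "$new"
cat app.txt
```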

You are using the word metadata wrong when trying to defend the article. That puzzled me a bit as well. When you say “the metadata needed to apply any remote changes”, metadata alone does not contain the information from the new commit - that’s data! A hefty 139 MB of data in my case. Metadata would be the commit hash, timestamp, … - information ABOUT the data, but NOT the data itself, by definition of the word.

The last paragraph of the article also says that fetch checks for changes and that you need to pull to apply them, leaving out the information that fetch already gets the data needed to apply the changes onto your computer.

All this article would need to be correct and remain simple is to mention that fetch downloads all the data for new changes and keeps it ready to use, but does not apply it to the files the user is working with.

This genuinely feels more like a misunderstanding of the command than an oversimplification mistake.

I am not attacking you or the author with my observation, I just want to push for improvement and clarification - especially given the prominence of this in the search results.

I get what you are saying, but you aren’t quite correct in your understanding of metadata. Git tracks the changes required to go from one commit to another. Once you have the repository downloaded, git tracks how to get from a past state to a current state. Those diffs are, to me, metadata that describes how to apply changes to the repository data.

If you are putting a compiled executable on GitHub or another repository, you are probably using Git in a way it is not intended… The typical usage is for source files, not deployed artifacts. Changing the source, compiling to an executable, and then tracking the changes to the executable inflates the information required to recreate the changes between two commits.

I agree that the description could be improved, but it isn’t as horrifically wrong as you think. The metadata about the changes in each commit is small unless you are doing something bizarre like changing the entire repo on every single commit.

Git tracks changes to text files, not the files themselves. What you’re talking about isn’t the data itself, it’s the changes to the data. Because you have committed a binary, that’s practically equivalent, but it’s still a record of changes.

I would like to bring the discussion back to the article. I appreciate the advice regarding the use of git. The nature of the tracked files does not change how git fetch works. Running fetch downloads all the required information to your computer; it does not merely check to see if there is an update.
The working files are not updated when running fetch alone - that is clear and correct.
The data transfer happens with fetch and all the information is ready to be applied - the opposite is implied in the article.
We could test how the article is understood by readers not familiar with git, and I am sure they would think fetch merely checks to see if there is any update without transferring the changes themselves.
I was specifically looking to find out whether there is a transfer of repo data, and the information in this article conflicted with every other description of pull vs fetch I could find…
Doesn’t it bother any of you to see this misleading information picked up and prominently displayed like that by Google?
I don’t think it’s simply a pedantic issue; it looks to me like it is easy to correct and clarify - though maybe that’s where we disagree.
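For what it’s worth, the behavior a reader might take away from the article - “just check whether the remote has something new, without transferring anything” - is what `git ls-remote` does, not `git fetch`. A minimal sketch (git assumed installed; repo names made up):

```shell
#!/bin/sh
# Sketch: `git ls-remote` only checks; `git fetch` also downloads objects.
set -e
tmp=$(mktemp -d)
cd "$tmp"

git init -q upstream
(cd upstream && git -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "initial")
git clone -q upstream work
cd work

# Lists the remote's ref names and commit hashes; no pack of objects is sent.
git ls-remote origin

# Comparing against the local remote-tracking ref tells you whether anything is new.
git rev-parse origin/HEAD
```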

Again, this is the part where you and the article both are poorly explaining what’s happening.

When using git fetch, change (meta)data about how to go from one commit state to the most recent state is ‘fetched’. Since you are (mis)using Git for a compiled binary, you are retrieving a massive amount of data, effectively getting a new copy of the repo for every single commit. This isn’t normal. The repo data isn’t transferred, just the (normally small, in correct usage) data about the changes.

Retrieving an entire fresh copy of the repo doesn’t happen on every commit in normal usage. But, since you are effectively deleting and totally replacing the contents of the repo on every single commit, you have bullied Git into retrieving a massive amount of data on each fetch.

It is not normal that git fetch downloads a new full copy of the repo. That is due to how you are (mis)using it, and not due to how Git works.

‘Transferred’ isn’t a term that has a precise technical meaning, so the author’s intent isn’t totally clear. But I still, yet again, agree that it would be better if the author said something else, like that changes are ‘applied’ with a pull but not with a fetch.

For the sake of not starting a subjective discussion about what is and what is not a correct use or misuse of git, let’s assume my example is not applicable.

If 5 new text files are pushed to the remote repo, the next fetch call downloads all the information - data plus metadata - for those files to the local repo. The size of those files does not change the fact that they are transferred to your computer during a fetch, not just checked for changes or updates, as the article leads one to believe.
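That five-file scenario can be sketched directly (git assumed installed; repo and file names made up): after the push, a fetch grows the local object store by the new blobs, trees, and commit, even though the working tree gains nothing:

```shell
#!/bin/sh
# Sketch: fetch transfers the objects for five new files into .git,
# while the working tree stays unchanged.
set -e
tmp=$(mktemp -d)
cd "$tmp"

git init -q upstream
(cd upstream && git -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "initial")
git clone -q upstream work

# Five new text files land on the remote.
(cd upstream \
    && for i in 1 2 3 4 5; do echo "file $i" > "file$i.txt"; done \
    && git add . \
    && git -c user.email=a@b -c user.name=a commit -q -m "add five files")

cd work
# Total objects in .git (loose plus packed).
count() { git count-objects -v | awk '/^count:|^in-pack:/ {s += $2} END {print s}'; }
before=$(count)
git fetch -q origin
after=$(count)

echo "objects before fetch: $before, after: $after"
ls file*.txt 2>/dev/null || echo "working tree: no new files yet"
```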

I would let users judge for themselves, given their own use cases, resources, and limitations, whether the changes are big or small, instead of assuming and generalizing.

The information required to create those files would be downloaded, yes. Eggs and flour, but not a cake.

In order to know what changes are available, you need some sort of description of the change.


So what happens next?
The author is notified?
Someone from the forum updates the article?

The article can be updated only by the editorial team - which can be reached at editorial@freecodecamp.org

Will one of you contact them or should I?

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.