Ask HN: Difficult Interview Question

23 ransom1538 40 9/3/2025, 8:19:47 PM
I am interviewing candidates for a data engineering role we have. One of the most critical questions I ask is:

"How can you transfer a file to another machine?"

I can't get anyone in an interview to answer this. I never get sftp,scp,rsync,email,usb,nas or s3 buckets/gsutil. Nothing. Nope.

I want to get into cool topics, parallel transfers, etc, nope.

Help. Is this question dated?

Comments (40)

EricRiese · 1d ago
Something bothers me about these questions. In the real world, when you're solving a problem, you have so much context. These questions are like waking up from a coma and you're in a video game and you don't even know the rules.

Obviously in the real world you need to ask follow up questions sometimes, but you have at least some context for orientation.

muzani · 1d ago
Part of the test is whether you try to know the rules. People are going to be dropped into a complex codebase with outdated documentation. For more senior roles, you want people who bring in expertise instead of just doing what they're told.

There's a certain kind of personality that will grind Leetcode for months but never develop a skill for questioning the question.

What kind of file am I transferring? How many machines and how frequently? Does it need to be encrypted? Who needs access? In a video game, you move the mouse around and mash buttons.

jdsnape · 1d ago
I don’t know, I have definitely been approached by someone with that exact statement before. Then I have to figure out what they’re actually trying to achieve to avoid an x-y problem, and figure out the right solution.

I would totally expect a good interview candidate to be able to ask questions to establish the context

paulcole · 1d ago
It seems like you’re indirectly making a lot of dangerous assumptions.

Why would it be wrong to ask about the rules?

Why would a good answer start with anything other than getting at least some context for orientation?

PenguinCoder · 1d ago
Man, if that was the level of interview questions I got, I'd be just fine, I think. Unfortunately it's stump the chump, or specific, opinion-based types. I remember one recently: I was asked my opinion about _certain technology_, and I gave it. Later I was told, "Well, I was expecting a different answer, I was looking for discussion on (something)."

Me: okay so that's your opinion, if that's the answer you wanted why did you ask for mine?

I did not get the job.

deepsun · 1d ago
They just looked for people like them. To an extent, that's ok, e.g. nerds like nerds. Sometimes a team lead very much likes _certain technology_ and wants a person on their team to also like it and not keep whining about it. I've been on both sides, and think it's alright. If the tech choice is the worst problem in that company, I'm ok with that.
winrid · 1d ago
Seems like a great filter.

A while ago I was interviewing candidates for a senior frontend position. My filter question was to explain how to make a progress bar. Most candidates couldn't do this, one said it was "not what they were expecting" and that they were just expecting leetcode problems.

(for non frontend people, it's just a styled box in a box....)

gs17 · 1d ago
I'm surprised they couldn't even come up with the <progress> element.
winrid · 1d ago
Some did. Not enough.
willejs · 1d ago
Do people not start asking questions like: over what medium? Is there direct IP connectivity, or NAT/a firewall in between? How long's the link? How big is the file? I would try to set some parameters if people are not asking, or are struggling.
HankStallone · 1d ago
I suppose some people might freeze because it's so vague and they're wondering what you're getting at. Personally, I'd say, "Well, in most cases I use rsync or scp, but it depends on the situation."

Then I'd tell the story about how one time in the mid-90s I needed to transfer a huge (probably 100MB at the time) file between a Unix system and a Windows system in our office. After FTP, Samba, and a couple other methods ran painfully slow for unknown reasons, I discovered that a DCC in IRC flew at top speed.

But I'm not desperate for a job and trying not to say the wrong thing.

codingdave · 1d ago
Not dated, but with zero context like that, I'd assume it was a trick question trying to see if I can come up with types of scenarios where sneakernet outperforms digital transfers.
shoo · 1d ago
Seems like a fantastic filter question to me. It's basic, it's not an unfair or trick question, there are multiple valid options that the candidate can respond with - you're not fishing for your single pet solution - and a strong candidate could demonstrate ability by going into more depth and mentioning multiple options and explaining when one option would be preferred vs others. Keep asking it.

If you're seeing a high rate of candidates who can't answer it at all, that suggests that the population of candidates who are applying for your roles are really poor fits for the role.

Maybe worth exploring if there's ways you could change how the roles are advertised to access a different population of candidates, or perhaps ask this basic question earlier in your hiring pipeline -- e.g. as an automated screening question as a pre-req to interviewing with humans.

ai_coder42 · 1d ago
I once asked a senior developer "how will you figure out which class/method is causing the server to crash". They gave me all sorts of answers other than "i'll add appropriate log statements and check the logs".

anyone who has actually worked for a while with large data sets should be able to answer your question. but since it "sounds simple", it probably causes most folks to freeze or go onto a tangent.

Unfortunately, because of how our industry treats interviews, people "prepare" for it and that puts even good folks in the mindset of prepared answers rather than practical answers.

Sn0wCoder · 1d ago
If nothing else, I would assume they have used GitHub or would say 'drop it in Teams'. Even if they have only ever used the cloud, there is 'send a link from my Google Drive or OneDrive'. My first answer would be FTP or rsync. Maybe they are overthinking it and assume they need to get it on another machine without the owner knowing about it, IDK. Maybe it needs to be reworded. Or follow up if there is no answer and ask how they turned in their homework to their professors, or have them think about how they collaborate with other developers on a project.
GarnetFloride · 1d ago
These kinds of discussions remind me of those times running a TTRPG when there is a puzzle to solve and 4+ engineering students can't figure out the back-of-the-cereal-box type puzzle.

So there is something going on that makes problem solving in a situation very challenging.

It's like it's hard to concentrate when the lizard brain is playing bongos on the Big Red Danger button it has access to.

dataflow · 1d ago
What response do you get?
tibbon · 1d ago
Maybe if they are having problems, offer a clearly sub-optimal way to do it, and try to get them to talk about why it's not great or what better ways might be. That might help uncover some thought pattern or understanding about it. Such options could be FTP, as an email attachment, or the body in a single HTTP POST.

They should be able to come up with thoughts about security, speed, reliability, etc.

romanhn · 1d ago
Now I'm really curious what kind of answers you're getting. If literally nothing, sounds like this is a great filter (but also makes me wonder what kind of candidates you're choosing to talk to).
0xfaded · 1d ago
If I'm trying to transfer a public ssh key for the first time I sometimes use netcat ;)

(Never nc >> .ssh/authorized_keys. Write to another file first and check that it is your key, just to be sure.)
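For anyone curious, here is a minimal sketch of the "stage it, check it, then append" step in Python. It assumes the key has already landed in a staging file by whatever means, and that you know the expected fingerprint from the sending machine. The function names (`fingerprint`, `install_key`) and all paths are hypothetical, made up for illustration; the fingerprint is the usual unpadded base64 of the SHA-256 of the decoded key blob.

```python
import base64
import hashlib
from pathlib import Path


def fingerprint(pubkey_line: str) -> str:
    """Return the SHA256 fingerprint of one '<type> <base64> [comment]' key line."""
    fields = pubkey_line.split()
    if len(fields) < 2:
        raise ValueError("not a '<type> <base64> [comment]' public key line")
    # Raises ValueError (binascii.Error) if the middle field is not valid base64.
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def install_key(staged: Path, authorized_keys: Path, expected_fp: str) -> None:
    """Append the staged key to authorized_keys only if it matches expected_fp."""
    lines = [ln.strip() for ln in staged.read_text().splitlines() if ln.strip()]
    if len(lines) != 1:
        raise ValueError(f"expected exactly one key, found {len(lines)}")
    actual = fingerprint(lines[0])
    if actual != expected_fp:
        raise ValueError(f"fingerprint mismatch: got {actual}")
    # Only now touch the real file, and only ever append to it.
    with authorized_keys.open("a") as fh:
        fh.write(lines[0] + "\n")
```

Anything unexpected in the staging file (extra lines, wrong key, junk) raises before authorized_keys is opened, which is the whole point of not piping straight into it.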

therealfiona · 1d ago
Sounds like a great weed out question if part of the job is moving data around.
haute_cuisine · 1d ago
How large is the file? Past certain sizes it's faster to move it by truck rather than available internet channel.
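A rough back-of-the-envelope sketch of where that crossover sits, in Python. Every number here is an assumption picked for illustration (file size, link speed, disk copy rate, drive time), and the helper names are hypothetical; plug in your own.

```python
def network_hours(size_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Hours to send size_tb terabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal link rate actually achieved.
    """
    bits = size_tb * 1e12 * 8
    effective_bits_per_s = link_mbps * 1e6 * efficiency
    return bits / effective_bits_per_s / 3600


def truck_hours(size_tb: float, copy_mb_per_s: float, drive_hours: float) -> float:
    """Hours to copy onto disks, drive them over, and copy off again."""
    copy_seconds = size_tb * 1e12 / (copy_mb_per_s * 1e6)
    return 2 * copy_seconds / 3600 + drive_hours


if __name__ == "__main__":
    size_tb = 15  # assumed file size
    net = network_hours(size_tb, link_mbps=1000, efficiency=1.0)
    truck = truck_hours(size_tb, copy_mb_per_s=500, drive_hours=4)
    print(f"network: {net:.1f} h, truck: {truck:.1f} h")
```

With those assumed numbers the truck wins (about 21 hours against about 33 for a fully saturated gigabit link), but the copy-on and copy-off time dominates the truck side, so a faster link or slower disks flips the answer.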
verzali · 23h ago
If I ask that question and your first suggestion is to use a truck, you probably aren't getting the job!

Surely the obvious answer is to go with ftp or a usb stick and then wait for the interviewer to say, well this file is 15 terabytes or something insane. Bonus points if your first suggestion is a floppy disk.

thoughtpalette · 1d ago
Maybe phrase it with a bit more specificity. "How can you transfer a file between pipelines", services, etc.
gs17 · 1d ago
> I can't get anyone in an interview to answer this. I never get sftp,scp,rsync,email,usb,nas or s3 buckets/gsutil. Nothing. Nope.

They literally don't respond at all? To be fair, I'd presume it's some sort of trick question, but a few follow-up questions (how big is the file? are there any security/privacy/etc rules involved? is there something else unusual about the computers (and are they local "machines", cloud VMs, etc)?) would resolve that quickly. Here I am practicing brainteasers and leetcode and I could get ahead by knowing you can send a file in an email.

m463 · 1d ago
You know what, just wait. When the person comes along you will know it.
icedchai · 1d ago
What do they say when you ask?
tester756 · 1d ago
File over Avian Carriers
cwmoore · 1d ago
grid-tone fluctuation origami. or netcat
scarface_74 · 1d ago
I would answer it like this:

For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.

ChrisArchitect · 1d ago
Is this one of those file under (pun intended) Gen-whatever doesn't know anything about files or filesystems because of their iphone/mobile upbringing things?
markus_zhang · 1d ago
Er, using a floppy?
d--b · 1d ago
This is because your question is an infrastructure question, not a data engineering question.

Sure it would be better if the engineers knew about infrastructure, but they don't have to know about it.

And honestly, the weeds of physical data transfer between computers are actually very complex, and I think data engineers should leave it to the pros rather than screwing around trying some stuff that's guaranteed to fail on security, performance, or caching.

Data engineers should know about how to deal with data content, not data transport. If that's what your role is about, you should re-brand it.

paulcole · 1d ago
Why not word it slightly differently:

“Tell me about a time you moved a file from one machine to another…”

Then have follow-ups ready to go to introduce constraints or issues that would explore the depth of their knowledge. Say for example, “What if that file had been 1000x bigger?” Or whatever.

I think you’ll get more of what you’re looking for with this approach.

cushpush · 1d ago
Another machine or ... another ip address [that is virtually a machine in a data center]?
pestatije · 1d ago
data engineering doesn't do data transfers... different subjects altogether
icedchai · 1d ago
Anyone who works with computers professionally should be able to answer this question.
daemonologist · 1d ago
This is kind of true, in that a data engineer is probably mostly thinking about data that already exists somewhere in a consumable fashion (a database, S3, an API, whatever). These usually have some default access method either built in or prescribed by devops/security/another team.

However, anyone who's ever worked with a server should be able to talk about moving files on and off of it - at least come up with one reasonable way to do it.

01HNNWZ0MV43FF · 1d ago
dang maybe I should pivot into data engineering. I'm great at that stuff. What is data engineering?
p_ing · 1d ago
Is it relevant to the role? Has someone in that role/with that title performed that type of work in recent memory? If so, it's not dated.