Ask HN: Difficult Interview Question
23 ransom1538 40 9/3/2025, 8:19:47 PM
I am interviewing candidates for a data engineering role we have. One of the most critical questions I ask is:
"How can you transfer a file to another machine?"
I can't get anyone in an interview to answer this. I never get sftp, scp, rsync, email, USB, NAS, or S3 buckets/gsutil. Nothing. Nope.
I want to get into cool topics, parallel transfers, etc., but nope.
Help. Is this question dated?
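For reference, the kind of first-pass answer I'm fishing for is any one-liner along these lines (hostnames, paths, and bucket names below are just placeholders):

    # copy over SSH
    scp data.csv user@remote-host:/data/

    # resumable, with progress, handy for bigger files
    rsync -avP data.csv user@remote-host:/data/

    # or via object storage
    aws s3 cp data.csv s3://my-bucket/incoming/
    gsutil cp data.csv gs://my-bucket/incoming/

Any one of those, plus a follow-up question or two, and we can get into the cool topics like parallel transfers.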
Obviously in the real world you sometimes need to ask follow-up questions, but you have at least some context for orientation.
There's a certain kind of personality that will grind Leetcode for months but never develop a skill for questioning the question.
What kind of file am I transferring? How many machines and how frequently? Does it need to be encrypted? Who needs access? Grinding Leetcode is like a video game: you just move the mouse around and mash buttons.
I would totally expect a good interview candidate to be able to ask questions to establish the context.
Why would it be wrong to ask about the rules?
Why would a good answer start with anything other than getting at least some context for orientation?
Me: Okay, so that's your opinion; if that's the answer you wanted, why did you ask for mine?
I did not get the job.
A while ago I was interviewing candidates for a senior frontend position. My filter question was to explain how to make a progress bar. Most candidates couldn't do this; one said it was "not what they were expecting" and that they were just expecting leetcode problems.
(for non-frontend people, it's just a styled box in a box...)
Then I'd tell the story about how one time in the mid-90s I needed to transfer a huge (probably 100MB at the time) file between a Unix system and a Windows system in our office. After FTP, Samba, and a couple other methods ran painfully slow for unknown reasons, I discovered that a DCC transfer in IRC flew at top speed.
But I'm not desperate for a job and trying not to say the wrong thing.
If you're seeing a high rate of candidates who can't answer it at all, that suggests the pool of candidates applying for your roles is a really poor fit for the role.
Maybe it's worth exploring whether there are ways you could change how the roles are advertised to reach a different population of candidates, or perhaps asking this basic question earlier in your hiring pipeline -- e.g. as an automated screening question that's a prerequisite to interviewing with humans.
Anyone who has actually worked with large data sets for a while should be able to answer your question. But since it "sounds simple", it probably causes most folks to freeze or go off on a tangent.
Unfortunately, because of how our industry treats interviews, people "prepare" for them, and that puts even good folks in the mindset of prepared answers rather than practical answers.
So there is something going on that makes problem solving in an interview situation very challenging.
It's like it's hard to concentrate when the lizard brain is playing bongos on the Big Red Danger button it has access to.
They should be able to come up with thoughts about security, speed, reliability, etc.
(Never nc >> .ssh/authorized_keys directly. Write to another file first so you can check it's actually your key before appending, just to be sure.)
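Roughly like this, with a placeholder port and filenames, and keeping in mind that netcat flags differ between the BSD and traditional variants:

    # receiver: listen and write to a scratch file, NOT straight into authorized_keys
    nc -l 2222 > /tmp/incoming_key.pub

    # sender: push the public key over (some variants need -N or -q 0 to close on EOF)
    nc receiving-host 2222 < ~/.ssh/id_ed25519.pub

    # receiver: eyeball it first, then append
    cat /tmp/incoming_key.pub
    cat /tmp/incoming_key.pub >> ~/.ssh/authorized_keys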
Surely the obvious answer is to go with FTP or a USB stick and then wait for the interviewer to say, "well, this file is 15 terabytes" or something insane. Bonus points if your first suggestion is a floppy disk.
They literally don't respond at all? To be fair, I'd presume it's some sort of trick question, but a few follow-up questions (how big is the file? are there any security/privacy/etc rules involved? is there something else unusual about the computers (and are they local "machines", cloud VMs, etc)?) would resolve that quickly. Here I am practicing brainteasers and leetcode and I could get ahead by knowing you can send a file in an email.
For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
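As a rough sketch, with a made-up hostname and credentials, and assuming curlftpfs and an SVN client are installed:

    # mount the FTP account as a local filesystem
    mkdir -p ~/ftp
    curlftpfs ftp://ftp.example.com ~/ftp -o user=myuser:mypassword

    # keep the repository on the mount; any machine that mounts it can commit and update
    svnadmin create ~/ftp/repo
    svn checkout "file://$HOME/ftp/repo" ~/work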
Sure it would be better if the engineers knew about infrastructure, but they don't have to know about it.
And honestly, the weeds of physical data transfer between computers are actually very complex, and I think data engineers should leave it to the pros rather than screwing around with some approach that's guaranteed to fail on security, performance, or caching.
Data engineers should know about how to deal with data content, not data transport. If that's what your role is about, you should re-brand it.
“Tell me about a time you moved a file from one machine to another…”
Then have follow-ups ready to go to introduce constraints or issues that would explore the depth of their knowledge. Say for example, “What if that file had been 1000x bigger?” Or whatever.
I think you’ll get more of what you’re looking for with this approach.
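For instance, once the file is "1000x bigger", you'd hope the answer drifts toward compression, resumability, and parallelism; something like this, where hosts and paths are placeholders:

    # compressed, resumable transfer that survives a dropped connection
    rsync -avz --partial --progress bigfile.parquet user@remote-host:/data/

    # or chunk it and push the pieces in parallel, reassembling on the far side
    split -b 1G bigfile.parquet chunk_
    # ...transfer the chunk_* files concurrently, then: cat chunk_* > bigfile.parquet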
However, anyone who's ever worked with a server should be able to talk about moving files on and off of it - at least come up with one reasonable way to do it.