4

I trained a model on my laptop for 3 days straight before I realized I'd forgotten to shuffle the data.

3 comments

averywilliams
Wait, doesn't shuffling just feel like one of those things you swear you did but actually didn't? @jana_fox50 I once spent a whole weekend trying to figure out why my text generator kept repeating the same paragraph over and over. Turned out I'd fed it a sorted list of customer complaints where every third entry was "billing error." The model basically learned that pattern perfectly and just cycled through the same five complaints like a broken record. It was kinda impressive, in a sad way, how well it memorized that order.
6
king.wyatt · 1mo ago
The worst part is your model probably learned the order of the dataset as a feature (which is a hilarious kind of overfitting). At least you got a really good benchmark for how much shuffling actually matters.
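For anyone who wants to avoid this, here is a minimal sketch of reshuffling the data at the start of every epoch in plain Python. The function name `shuffled_epochs`, the seed, and the toy complaint list are all made up for illustration; in a real training loop you would more likely turn on whatever shuffle option your data loader already has.

```python
import random


def shuffled_epochs(examples, num_epochs, seed=0):
    """Yield a freshly shuffled copy of `examples` once per epoch."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    for _ in range(num_epochs):
        order = list(examples)  # copy, so the caller's list stays untouched
        rng.shuffle(order)      # in-place shuffle of the copy
        yield order


# Hypothetical sorted dataset, e.g. a complaints log ordered by category.
data = ["billing error"] * 3 + ["late delivery"] * 3 + ["login issue"] * 3

for epoch, order in enumerate(shuffled_epochs(data, num_epochs=2)):
    print(epoch, order[:4])
```

Reshuffling every epoch, rather than once up front, means the model never sees the same ordering twice, so there is no fixed order left for it to memorize.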
1
jana_fox50 · 1mo ago
Right? @king.wyatt, my model once learned the date stamps.
7