The fact that invalidating the cache fixes the issue (meaning Python then fetches a fresh copy of the file instead of using your cache) suggests the corruption is happening in your cached document. It looks like a problem with how that document is stored on the filesystem.
Can you ever reproduce the issue if discovery file caching is completely disabled?
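
If it helps narrow things down, here is a rough sketch of how you could check whether a cached document is corrupt, and how to write it so a half-written file never ends up in the cache. It assumes the cached discovery document is a single JSON file; the function names and any paths are hypothetical and not taken from your setup or from any particular library.

```python
import json
import os
import tempfile


def is_cache_valid(path):
    """Return True if the file at `path` can be read and parsed as JSON."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            json.load(f)
        return True
    except (OSError, ValueError):
        # OSError: missing or unreadable file.
        # ValueError: truncated or malformed JSON, or invalid UTF-8.
        return False


def write_cache_atomically(path, document):
    """Write `document` as JSON to `path` without exposing partial writes."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(document, f)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Running `is_cache_valid` on the cached file right when the error shows up would tell you whether the file on disk is actually damaged. If it is, concurrent or interrupted writes are a plausible cause, and a write-then-rename approach like the one above is one way to rule them out.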