Fake File & MIME Type Dataset Generator
Testing file upload APIs, storage services, and media libraries demands realistic file metadata, not placeholder strings or repeated filenames. This fake file and MIME type dataset generator produces mock file records with plausible filenames, file sizes in bytes, matching MIME types, MD5-formatted checksums, and modification timestamps. Each record mirrors what a real filesystem or cloud storage bucket would return, so your tests run against data shaped like the real thing.
Choose a specific file category (images, documents, audio, video, or code) or select mixed to simulate a general-purpose storage bucket containing multiple content types. The count control lets you generate anywhere from a handful of records to dozens at once, matching the scale of the scenario you're testing.
Developers commonly use this output to seed database tables that track uploaded assets, mock the responses returned by S3-compatible APIs, or drive UI tests for file manager components without touching a real filesystem. Because each MIME type matches its file extension, the records also work for testing server-side validation rules that reject mismatched content types. The generated checksums are correctly formatted MD5 hex strings, so they are structurally valid for any code that reads or stores hashes, even though they aren't computed from actual file content. Combined with file sizes drawn from typical ranges for each category, the output gives you a dataset that resembles a real file listing.
How to Use
- Set the Number of Files count to match how many records your test scenario or database fixture needs.
- Choose a File Category from the dropdown — pick a specific type like 'images' or 'documents', or 'mixed' to simulate a general-purpose bucket.
- Click Generate to produce the file records, each containing a filename, size, MIME type, MD5 checksum, and modification date.
- Copy the output and paste it directly into your test fixture file, database seed script, or API mock handler.
- If you need a different distribution of file types, run the generator multiple times with different categories and combine the results.
Use Cases
- Mocking S3 ListObjectsV2 API responses in unit tests
- Seeding a file_metadata table with varied MIME types for database testing
- Testing file manager UI rendering with mixed image, video, and document entries
- Validating server-side MIME type and extension mismatch rejection logic
- Generating fixture data for media library search and filter features
- Populating Storybook or Figma prototypes with realistic file names and sizes
- Testing pagination logic on file listing endpoints with large record sets
- Simulating cloud storage audit logs with varied timestamps and checksums
Tips
- For database seeding, generate separate batches per category so you can control the ratio of images to documents to video in your test data.
- When testing MIME validation rejection, generate a batch, then manually swap one MIME type so it no longer matches its extension. The surrounding valid records provide realistic context.
- Pair video or audio records with large file sizes when testing upload progress or chunked transfer logic, since unrealistically small sizes hide edge cases.
- If your storage system enforces unique filenames, append the MD5 checksum's first eight characters to each name as a suffix before inserting into your database.
- Use the code file category specifically when testing syntax-highlighting file managers or code repositories, since the extensions map to real language MIME types.
- Mixed-category output works best for testing sort-by-type and filter features, since it guarantees multiple MIME type groups appear in a single dataset.
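The checksum-suffix tip above can be sketched in a few lines of Python. This is only an illustration: the record keys `filename` and `md5` are assumed names, since the generator's exact output format is not specified here, and the helper `with_checksum_suffix` is hypothetical.

```python
from pathlib import PurePosixPath


def with_checksum_suffix(filename: str, md5_hex: str, length: int = 8) -> str:
    """Insert the first `length` checksum characters before the extension."""
    path = PurePosixPath(filename)
    return f"{path.stem}-{md5_hex[:length]}{path.suffix}"


# Two records that share a filename (assumed record shape, sample values).
records = [
    {"filename": "report.pdf", "md5": "9e107d9d372bb6826bd81d3542a419d6"},
    {"filename": "report.pdf", "md5": "e4d909c290d0fb1ca068ffaddf22cbd0"},
]

unique_names = [with_checksum_suffix(r["filename"], r["md5"]) for r in records]
print(unique_names)
```

Placing the suffix before the extension keeps the extension intact, so extension-based MIME checks still see `.pdf`.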
FAQ
What is a MIME type and why does it matter for file testing?
A MIME type is a label like image/jpeg or application/pdf that tells browsers and servers how to handle a file's content. When testing upload or storage systems, using accurate MIME types ensures your validation logic, content negotiation, and rendering code all behave as they would in production — catching bugs that generic placeholder data would miss.
Are the MD5 checksums valid or just random strings?
They are randomly generated hexadecimal strings with the correct 32-character MD5 length. They aren't computed from real file content, so they won't verify against actual files. They are suitable for testing code that stores, displays, or compares checksums structurally, such as integrity-check UI or deduplication logic. Because random values will rarely repeat, copy a checksum onto a second record by hand if a deduplication test needs a guaranteed match.
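The difference between a structurally valid checksum and one that verifies against content can be shown with a small Python sketch. The helper names `is_md5_shaped` and `matches_content` are hypothetical, and the sample value stands in for a generated checksum.

```python
import hashlib
import re

# An MD5 digest in hex is exactly 32 hexadecimal characters.
MD5_HEX = re.compile(r"[0-9a-f]{32}")


def is_md5_shaped(value: str) -> bool:
    """Structural check only: right length and character set."""
    return MD5_HEX.fullmatch(value.lower()) is not None


def matches_content(value: str, content: bytes) -> bool:
    """Integrity check: the value must equal the MD5 of the actual bytes."""
    return hashlib.md5(content).hexdigest() == value.lower()


sample_checksum = "9e107d9d372bb6826bd81d3542a419d6"  # stand-in for a generated value
file_bytes = b""  # stand-in for real file content

shape_ok = is_md5_shaped(sample_checksum)
content_ok = matches_content(sample_checksum, file_bytes)
print(shape_ok, content_ok)
```

Code under test that only calls something like `is_md5_shaped` will accept the generated values; code that recomputes the hash from file bytes will not.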
Can I use this to test MIME type validation in a file upload endpoint?
Yes. Each generated record pairs a real file extension with its correct MIME type, so if your endpoint checks that the declared type matches the extension, these records will pass as expected. To test rejection logic, manually swap a MIME type after generating — for example, change image/png to application/pdf on a .png filename.
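As a rough sketch of that workflow in Python, the check below compares a declared MIME type against a small extension table and then runs it over valid records plus one manually tampered copy. The table `EXPECTED_MIME`, the function `mime_matches_extension`, and the record keys `filename` and `mime_type` are all assumptions for illustration, not the generator's documented format or your endpoint's actual rules.

```python
from pathlib import PurePosixPath

# Illustrative subset of extension-to-MIME pairs (standard registered types).
EXPECTED_MIME = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".pdf": "application/pdf",
    ".mp3": "audio/mpeg",
    ".mp4": "video/mp4",
    ".json": "application/json",
}


def mime_matches_extension(filename: str, mime_type: str) -> bool:
    """Return True only when the declared type matches the known extension."""
    ext = PurePosixPath(filename).suffix.lower()
    # Unknown extensions yield None here and are therefore rejected.
    return EXPECTED_MIME.get(ext) == mime_type


records = [
    {"filename": "logo.png", "mime_type": "image/png"},
    {"filename": "invoice.pdf", "mime_type": "application/pdf"},
]

# Swap the MIME type on a copy of one record to create a mismatch.
tampered = dict(records[0], mime_type="application/pdf")

rejected = [
    r["filename"]
    for r in records + [tampered]
    if not mime_matches_extension(r["filename"], r["mime_type"])
]
print(rejected)
```

Only the tampered copy ends up in `rejected`, which is the behavior a rejection test would assert on.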
What file categories can I generate?
The category selector offers images (JPEG, PNG, WebP, GIF), documents (PDF, DOCX, XLSX, TXT), audio (MP3, WAV, OGG), video (MP4, MOV, MKV), code files (JS, PY, JSON, HTML), and a mixed set that combines all types. Mixed is useful when simulating a real-world storage bucket where users upload various content types.
Are the file sizes realistic for each category?
Yes. Sizes are drawn from ranges typical for each file type — thumbnail images will appear smaller than raw video files, and plain-text documents smaller than spreadsheets. This matters when testing UI features like size-based sorting, storage quota calculations, or upload progress bars, where unrealistic sizes would expose your mock immediately.
How do I use this output to mock an S3 API response?
Generate records in the mixed or target category, then map each entry to the S3 ListObjectsV2 Contents schema: the filename becomes Key, the size becomes Size, the modification date becomes LastModified, and the checksum becomes ETag (wrapped in double quotes, as S3 does). This gives you a structurally valid mock response without needing a real bucket.
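A minimal Python sketch of that mapping follows. It builds a dictionary shaped like the parsed ListObjectsV2 response, using only a subset of the real response fields. The input keys `filename`, `size`, `md5`, and `modified` (as an ISO 8601 string) are assumed names for the generated record fields, and `to_list_objects_v2` is a hypothetical helper.

```python
from datetime import datetime


def to_list_objects_v2(records, bucket, prefix=""):
    """Map generated file records to a ListObjectsV2-style response dict."""
    contents = [
        {
            "Key": prefix + r["filename"],
            "LastModified": datetime.fromisoformat(r["modified"]),
            "ETag": f'"{r["md5"]}"',  # S3 wraps ETag values in double quotes
            "Size": r["size"],
            "StorageClass": "STANDARD",
        }
        for r in records
    ]
    return {
        "Name": bucket,
        "Prefix": prefix,
        "KeyCount": len(contents),
        "MaxKeys": 1000,
        "IsTruncated": False,
        "Contents": contents,
    }


# Sample record with assumed field names and made-up values.
records = [
    {
        "filename": "photo.jpg",
        "size": 204800,
        "md5": "9e107d9d372bb6826bd81d3542a419d6",
        "modified": "2024-03-14T09:26:53+00:00",
    }
]

response = to_list_objects_v2(records, bucket="test-bucket", prefix="uploads/")
print(response["Contents"][0]["Key"], response["Contents"][0]["ETag"])
```

A test can return this dictionary from a stubbed client call instead of contacting a bucket. To exercise pagination, you would additionally need to set IsTruncated and a continuation token, which this sketch leaves out.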
Can I generate hundreds of records at once?
Increase the count to generate larger sets for stress-testing pagination logic or seeding a development database table, then copy the full output. If you need more records than a single run produces, generate in batches and concatenate the results.
Are the filenames unique across a single generated set?
Filenames are generated to be varied and realistic, though occasional collisions are possible in large batches since names draw from a finite pool of patterns. If uniqueness is critical — for primary keys or deduplication tests — append the checksum or a numeric index to each filename after generating.