The press is bad
I began making a small utility that should meet the following requirements:
- periodically, maybe twice a day, it should visit a couple of the most popular Serbian news sites and just take a snapshot of the front page, and maybe of the politics section
- the generated images (in PNG format) should be uploaded to Cloudinary and thus receive a unique URL
- a simple MongoDB collection should be used to gather the basic data: the name of the site, the time the screenshot was taken and the URL of the Cloudinary image
- a very bare-bones Express app should query the MongoDB collection and retrieve the links, running on a free Heroku dyno
- everything should be covered by tests (mocha, chai etc.)
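The requirements above can be sketched as one small flow. This is only a rough outline, not the actual TraShot code: the `TARGETS` entries, the field names `site`, `takenAt` and `imageUrl`, and the helper names `buildRecord` and `captureAll` are all made up for illustration. The three steps (screenshot, upload, save) are passed in as functions, so the flow can be exercised with fakes before Puppeteer, Cloudinary or MongoDB are wired in.

```javascript
// Hypothetical hardcoded list of pages to capture; names and URLs are placeholders.
const TARGETS = [
  { site: 'example-news', url: 'https://example.com/' },
  { site: 'example-news', url: 'https://example.com/politics' },
];

// Build the document to store for one screenshot: site name, capture time
// and the URL of the uploaded image (the three fields from the list above).
function buildRecord(target, imageUrl, takenAt = new Date()) {
  return { site: target.site, takenAt, imageUrl };
}

// Run the flow for every target, one after another.
// takeScreenshot(url) -> PNG bytes, upload(png) -> image URL, save(record) -> void.
async function captureAll(targets, { takeScreenshot, upload, save }) {
  const records = [];
  for (const target of targets) {
    const png = await takeScreenshot(target.url);
    const imageUrl = await upload(png);
    const record = buildRecord(target, imageUrl);
    await save(record);
    records.push(record);
  }
  return records;
}
```

Injecting the steps also fits the "covered by tests" requirement: a mocha test can pass stub functions and assert on the returned records without touching the network.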
It is quite simple, actually. On the one hand, I wanted to make a simple visual collection representing the horror of the Serbian daily newspapers and tabloids; on the other, I wanted to get my feet wet with the Node ecosystem.
I started with a horrid old-school setup where I would actually save the screenshot to a PNG file on the server, then upload it and finally (manually) delete the image. After a bit of searching the Cloudinary docs and googling, I found this interesting setup. Ire Aderinokun basically keeps the screenshot in a buffer to avoid writing and deleting image files, so I modified my initial approach.
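The buffer idea can be sketched roughly like this. It is a minimal illustration, not the code from the linked setup or from TraShot: `uploadBuffer` is a made-up helper name, and it assumes `uploader` is the object you get from `require('cloudinary').v2.uploader`, whose `upload_stream(options, callback)` returns a writable stream.

```javascript
// Wrap Cloudinary's callback-style upload_stream in a Promise.
// Assumption: `uploader` is require('cloudinary').v2.uploader (or a stand-in
// with the same upload_stream(options, callback) shape).
function uploadBuffer(uploader, buffer, options = {}) {
  return new Promise((resolve, reject) => {
    const stream = uploader.upload_stream(options, (error, result) => {
      if (error) {
        reject(error);
        return;
      }
      resolve(result.secure_url); // HTTPS URL of the uploaded image
    });
    // Send the in-memory PNG bytes and close the stream; nothing touches disk.
    stream.end(buffer);
  });
}
```

With Puppeteer, calling `page.screenshot()` without a `path` option resolves to the image bytes, so the result can go straight into `uploadBuffer` and there is no temporary file to clean up afterwards.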
I am still getting used to this async/await world of promises, rejections and sad puns. JS is a different beast compared to Python and, hard as it seems, I really feel like it is broadening my horizons. Hooking the system up to a MongoDB instance was pretty easy with Mongoose, although I have to check why my script (it’s still just a standalone script, no server!) hangs when it finishes. Anyway, the screenshot URLs are being stacked in Mongo, and building a simple Express app around them shouldn’t be difficult. I plan to deploy it on Heroku, make a small CSS wrapper with Bulma and… that would be it for the time being. The list of pages/sites will remain hardcoded, although it would be trivial to add a list of links to the Mongo instance.
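On the hanging script: a likely suspect (an assumption, not something verified for this script) is that the Mongoose connection is still open, since an open socket keeps the Node process alive. A minimal sketch of closing it once the work is done follows; `runThenDisconnect` is a made-up helper name, and `db` is assumed to be the object returned by `require('mongoose')`, which exposes `connect(uri)` and `disconnect()`.

```javascript
// Connect, run the job, and always close the connection afterwards.
// Assumption: `db` is require('mongoose') or an object with the same
// connect(uri) / disconnect() methods, both returning promises.
async function runThenDisconnect(db, uri, job) {
  await db.connect(uri);
  try {
    return await job();
  } finally {
    // Runs on success and on failure, so the process is free to exit.
    await db.disconnect();
  }
}
```

If the script still does not exit after this, something else is holding the event loop open, for example a Puppeteer browser that was never closed with `browser.close()`.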
Btw, I’m calling it TraShot - because, y’know, it takes shots at trash… Like, real trash.
OK, about the timeout: Puppeteer's waiting time has to be longer, as per this site.
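A rough sketch of what a longer wait could look like. Puppeteer's `page.goto` accepts a `timeout` option in milliseconds (the default is 30 seconds) and a `waitUntil` condition. The helper name `openSlowPage`, the 90-second value and the choice of `networkidle2` are assumptions for illustration, not settings taken from the linked site.

```javascript
// Navigate with a longer timeout than Puppeteer's 30-second default,
// then take a full-page screenshot that stays in memory.
// Assumption: `page` is a Puppeteer Page (e.g. from browser.newPage()).
async function openSlowPage(page, url, timeoutMs = 90000) {
  await page.goto(url, { waitUntil: 'networkidle2', timeout: timeoutMs });
  // No `path` option, so the image bytes are returned instead of written to disk.
  return page.screenshot({ fullPage: true });
}
```

Heavy front pages with lots of ads and trackers may never go fully quiet, which is why this sketch waits for `networkidle2` (at most two open connections) rather than `networkidle0`.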