#digital preservation


virescent-phosphor:

linguinibot:

californiasplit:

its so fucked up that optical discs straight up rot though right? something about digital media just feels like it shouldnt be susceptible like that to the forces that govern the physical world and yet discs rot as if theyre an organic thing

This also happens with digital data due to the degradation of the physical storage media! This book (Best Before by James Newman) talks a lot about it in the context of videogames and the implications it has for the ongoing efforts to archive them!

I collect Laserdiscs and certain runs of them are notorious for laser rot. Afaik there is a problem with the adhesive layer and it damages the disc. Generally it’ll just start out as video artifacts and worsen over time; what’s nuts is that you really can just watch it happen if you look closely at the disc. Now, I’m a little paranoid about some of my older DiscoVision discs, lol.

And yeah, the major problem with digital storage that’s mentioned in that book excerpt is that damaged analog media is often still readable! It’s just damaged! So you might have missing parts, distortion, etc., but it’s easier to recover from than if critical chunks of digital data get wiped out. I think as a whole video/audio is probably less susceptible since damage may be more likely to be recoverable, but it’s potentially very bad for software. Digital is just an all or nothing type of storage even if you try to put in error correction; you just hope a loss isn’t noticed, like dropped packets on a video/phone call or a glitchy frame on a Netflix stream. A disc with a lot of errors might be compromised but readable…unless something critical is lost, and then you’re really screwed. (See: one of my floppy disks that can’t play a game at all because it gets stuck on one track.)

What’s really interesting too is how well-made some tape was. Consumer stuff isn’t always as good, and some old cassettes are notorious for “shedding”, but there was some fantastically well-made computer tape that’s probably still holding data 40 or 50 years after it was written.

This is why preservation and archival matters so much, including making multiple backups in case, for example, one copy of something is lost. Demonization and dismissal of emulator projects, software and video game historians, and film/music archivists will eventually lead to tons of information being lost.
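As a rough illustration of why multiple backups help: if you keep several copies of the same file, you can hash each one and flag any copy that no longer matches the rest. This is only a sketch. The function names are made up for this example, and it assumes the majority of your copies are still intact.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large disc images don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_divergent_copies(paths):
    """Return the copies whose hash differs from the most common hash.

    Assumption: most copies are still good, so the majority hash is the
    "correct" one. With only two copies a mismatch can be detected, but
    this cannot tell you which copy is the damaged one.
    """
    hashes = {Path(p): sha256_of(p) for p in paths}
    if not hashes:
        return []
    majority_hash, _ = Counter(hashes.values()).most_common(1)[0]
    return [p for p, h in hashes.items() if h != majority_hash]
```

Running `find_divergent_copies` over three or more copies on a schedule is one way to notice a rotting copy while good ones still exist to replace it.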

People need to step back from their desire to patent troll and sit on copyrights forever and really ask themselves what they are going to do when the discs rot and no good copies are left.

(Thankfully, I don’t think this will happen for a lot of things…maybe just for very rare or uncommon stuff.)

Before I started down a technology career path I studied art, and specifically I studied ceramics and sculpture.

One of the things that fascinates me about ceramics is it is a permanent process. When the firing is complete stoneware essentially is just that — stone. It can be smashed or ground into dust, but it will remain stone. Bury it or seal it in a tomb, and it’s not going to change much for eons.

Pottery is one of humanity’s oldest surviving art forms. We have found pieces dating back 20,000 years. We have great painted earthenware pots thousands of years old that still proudly display their scenes. We have clay tablets nearly 4,000 years old that still tell stories of swindling copper merchants.

Ceramics have come a long way since the fragile earthenware our ancestors made. Stoneware is much stronger and more durable. A well-fired stoneware is vitrified and non-porous, and thus less susceptible to destruction by water seeping into it. We’ve spent centuries studying every element and mineral on this planet to find how to get brilliant colors out of a 1400°C inferno.

Digital storage is temporary. Wood and paper will rot. Steel will rust. Paint will fade. Plastics will embrittle and disintegrate.

Stoneware is as close to permanent as we can get.

Others have written at length about what anthropologists will find when they look back at our era. Our art, largely stored digitally, will have been lost to time. Our identifying artefact will be a layer of fine particles of plastic. What will they be able to read of our culture? Of our stories?

I’ve often contemplated how we could tell the stories of our time on stoneware. How do we convert the fleeting, ephemeral nature of contemporary culture into something permanent? What would we tell? What lessons are worthy of passing on to generations far into the future? Where do we store it?

I am a child of the digital age. I came of age steeped in the culture of the internet. I have seen memes live and die. I have seen great works of art lost to shuttered businesses, failed hard drives, and the constant march of Progress. One of my hobbies is the Sisyphean task of maintaining 40-year-old digital equipment that was never intended to last more than a few years, just so its history can be remembered a little longer. But always in the back of my mind: this is only temporary.

Perhaps we can hold onto some of it, by throwing it first into the fire.

In early December, Yahoo Gedden volunteer Doranwen completed her compiling of the metadata on the Yahoo Groups that were saved by the Archive Team and the Yahoo Gedden Team. That metadata has now been uploaded to the Internet Archive: https://archive.org/details/Yahoo_Groups_Metadata

“The spreadsheets include all 960,613 confirmed GMD groups, plus 150,394 other groups (many of which had been long-dead by the time we came across their names, referenced only in links within other groups).

At some point it may get moved into a specific collection of Yahoo Groups stuff, but at least it’s uploaded now so everyone can browse the spreadsheets and see group descriptions, member numbers, creation dates, and much more. (And should you want to, you can identify from the spreadsheets which groups we have group photos for and go view those pictures by downloading the correct tar file and extracting the raw data within to see each file.)

If you know anyone who wants to use this for any sort of research or analysis, please point them to that link! I want it to be accessible to all.”

Note this is just the metadata. We will be talking about the actual number of Yahoo Groups that were saved in the next post.

starfleetdoesntfirefirst:

8tracks Backup

The following is the new, up-to-date as of 1/6/2020 version of this reblog chain with extraneous and out-of-date information removed and a link to and information on the new workbook.

After 8tracks announced, with only a few days’ notice, that it was shutting down on 12/31/2019, an effort was made to preserve as many playlists as possible. The original 8tracks backup macro by VidderAdmin was downloaded over a hundred times across multiple continents, and the information from thousands of playlists was saved. Go fandom!

However, it turns out that 8tracks is staying up a bit longer—though we have no way of knowing how long—which leaves more time to save playlist information. To this end, VidderAdmin and the team that formed to work on this created a new macro workbook that fixes some issues and improves functionality.

The Updated Macro Workbook

- FIXED: playlists with Unicode producing 0kb files (and helps rerun files that failed)

- FIXED: missing images (and helps rerun files that failed)

- IMPROVED FUNCTIONALITY: helps rerun failures, allows user to choose folders to download to and to download to subfolders by fandom tag specified by user, accounts for extraneous text at the end of URLs without the user needing to ctrl+f and delete it, checks folders to ensure every text file has a matching image and reruns those without
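For anyone who would rather check their download folder outside of Excel, the same kind of check (empty text files, and text files with no matching image) can be sketched in Python. This is not the workbook's code. The file extensions and the idea that a playlist's text file and image share the same base name are assumptions made for this example.

```python
from pathlib import Path

# Assumed extensions; the real workbook's output names may differ.
TEXT_SUFFIXES = {".json", ".txt"}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def find_failed_downloads(folder):
    """Return (empty_text_files, text_files_without_image) for one folder.

    Assumption: a playlist's text file and cover image share a base name,
    e.g. "my-mix.json" and "my-mix.jpg".
    """
    folder = Path(folder)
    files = [p for p in folder.iterdir() if p.is_file()]
    image_stems = {p.stem for p in files if p.suffix.lower() in IMAGE_SUFFIXES}
    text_files = sorted(p for p in files if p.suffix.lower() in TEXT_SUFFIXES)

    # 0-byte text files are the symptom described for Unicode playlists.
    empty = [p for p in text_files if p.stat().st_size == 0]
    # Text files with no image of the same base name need a rerun too.
    missing_image = [p for p in text_files if p.stem not in image_stems]
    return empty, missing_image
```

Anything returned in either list is a candidate for re-running with the new macro.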

You will need to have macros enabled to run the spreadsheet; here’s how to enable macros. Side note: genuinely friendly PSA that macros are default-disabled in Excel for a reason. Macro viruses can send and delete files and be computer-destroying levels of dangerous; always be careful what you choose to download and run!

Download: bit.ly/8tracksbackup


A screencap of the download with 8tracks Backup in the lefthand corner and below it the tabs About, Instructions, Column Definitions, and Version History. The About tab is open; text captioned below.

[Rest of ID in alt text] “8tracksbackup is a macro-enabled workbook that helps to quickly download playlist metadata (including the track list) and cover artwork based on URLs that you provide. We have also included some tools to help ensure those downloads worked correctly.

We’re a team of volunteers hoping to preserve fandom history before it’s lost, but we are not affiliated with 8tracks, and we are not affiliated with Internet Archive.

While we’ve done our best to test out the macros in this workbook and address any bugs or glitches, unfortunately we can’t guarantee how it will perform, and you are using this at your own risk.

If you’re familiar with VBA, you can take a look at the code yourself, but again please know that we can’t guarantee how it will perform if you modify the code, or if you are sent a copy of this workbook that didn’t come from us.

We’ve included some instructions within these tabs, but if you have further questions, please contact us at 8tracksbackup AT gmail DOT com and we’ll do what we can to help.

If you are interested in submitting your 8tracks metadata and covers to the Internet Archive to be included in an 8tracks Fandom History Collection, please contact us at the email above by 12/31/2020.

If you’d like us to notify you when we’ve updated the workbook, or if you do not want your playlists included in our collection, you can fill out this Google Form: forms.gle/9Weh4RpKYnXFTrMQ6

Thank you for helping to save at-risk fanworks!”

So, do you need to re-run URLs you’ve already saved?

One of the main (and hardest to spot) issues in the original pre-New-Years macro was that JSON files (where the tracklists are) were coming out blank for playlists whose information included Unicode (pretty much anything not in the Roman alphabet; for example, Japanese lettering and Chinese characters).

So if you want to make sure that closer to all of the playlist URLs you ran through the macro have their tracklists saved, especially for tags/fandoms with many playlists with Unicode (for example, anime fandoms), you may want to re-run them with the new macro (which also has some convenient ways of finding the errored playlists).

However, don’t despair if you won’t have time to re-run URLs; having already saved a majority is much better than nothing having been saved! (If you don’t have time, you can also share your URLs with us at the email above in case someone has time to run them on the new macro, though please know that we may not have time to get to them. Please include a note that they were already run with a previous version—thanks!)

A note for folks who sent me, starfleetdoesntfirefirst, URLs to run: As I mentioned in another post, back in the days of the first macro I was able to get to some or all of what each person sent, but may not have time to get to all of what each person sent depending on when 8tracks shuts down. However, I almost definitely won’t have time to use the new macro to re-run Unicode-containing and other errored playlists I already ran. I’ll pass them along to the rest of this team, but given that none of us can guarantee we’ll get to them before 8tracks shuts down, if you feel strongly about making sure everything you sent to me gets re-run with the new macro, it may be worth pinging your Excel-having friends for aid. (I apologize for this; I didn’t anticipate things getting to this stage rather than a quick pre-New-Years effort!)

How to record & check (some of) which 8tracks tags have already had their playlists extracted

This Dreamwidth post is a place where you can comment to record which 8tracks tags, from any fandom (or nonfandom topic like “autumn”), you have extracted the URLs from, and check which 8tracks tags others have already extracted the URLs from, to avoid unnecessarily duplicating work. (You don’t need a Dreamwidth account to comment!) Not everyone who is extracting URLs is going to know about this post, so no guarantees of avoiding duplication, but it’s a start. :)

How to extract playlist URLs for use with the workbook

1) Do a tag search on 8tracks (or go to your own “liked” or “already listened” page if those are the playlists you want to save). Scroll to the very bottom of the page (so that all playlists have loaded and actually appeared on the page).


The bottom of the search page for the tag "nyota uhura" with all playlists loaded.

2) Use a link-extractor plugin (like this one for Chrome) to extract all the links on the page.

A link extraction showing the first of several hundred links, many of which are random 8tracks links, not playlists.

3) Filter for links with the word “play” in them. This will pull up only the actual playlists. (There were 54 playlists tagged “Nyota Uhura,” and as you can see in the screencap, adding the filter “play” gives exactly 54 results.)

The same link extraction with "play" in the filter field and only playlist results.

4) Copy into spreadsheet.
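If you end up with a raw list of links and want to do the step 3 filtering outside the plugin, it can be sketched in a few lines of Python. The function name is made up for this example, and the sample URLs in the usage note are placeholders, not real 8tracks addresses.

```python
def filter_playlist_links(links, keyword="play"):
    """Keep links containing the keyword, dropping duplicates but keeping order."""
    seen = set()
    playlists = []
    for link in links:
        link = link.strip()  # extracted lists often carry stray whitespace
        if keyword in link and link not in seen:
            seen.add(link)
            playlists.append(link)
    return playlists
```

For example, calling `filter_playlist_links(["https://example.com/about", "https://example.com/play/1"])` keeps only the second link. As with the plugin filter, it is worth comparing the count of results against the number of playlists the tag page reports.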

Note on Archive Team’s effort

Archive Team is also attempting to archive as many playlists as they can, by random number rather than by tag due to 8tracks’s 1000-playlist limit in tag search. These will probably be stored on the Internet Archive as WARC files, which means that it will be harder for non-power-users to access playlist info, thus our continuing this separate effort.

Per @meeedeee: As a result they dropped trying to archive by tags, instead they’re running 3.5 million random numbers in the hopes of grabbing what they can. On the back end of 8tracks, the playlist data is not stored by URL, or by tag, but by unique numeric identifier. Because of the volume that they have to run, the Archive Team will not be archiving the “look and feel” (user profile icons). Only basic metadata and the raw cover art.


8.13.21 | Wrapped up 3 research texts this week and I’m pretty happy with it. Celebrate with wine always. Library work is wonderful, but archives are chilly in summer! I’m itching to get back to museum work though, I miss it badly. And I still have to edit my a/v research paper but am dragging my feet. Oh well.



7.16.21 | Things are good! Things are paying off! I have a 3.8 and a 3.9 in my dual degrees masters program, I just nailed 2 interviews in both job fields I’m aiming for at premier institutions, with one more next week. My thesis is outlined, research fitting my theory. Just celebrated a friend’s birthday, my parents are coming to the city next week. Life’s just good.

If anyone’s feeling burnt out, please know that the effort you put in does pay off!

