“NARA will block access to commercial ChatGPT on NARANet [an internal network] and on NARA issued laptops, tablets, desktop computers, and mobile phones beginning May 6, 2024,” an email sent to all employees, and seen by 404 Media, reads. “NARA is taking this action to protect our data from security threats associated with use of ChatGPT.”
The move is particularly notable considering that this directive is coming from, well, the National Archives, whose job is to keep an accurate historical record. The email explaining the ban says the agency is particularly concerned with internal government data being incorporated into ChatGPT and leaking through its services.
I have an org file for a long-running project. It's getting hard to manage because there are lots of different tasks, events, etc.
I think I want to create an "archive version" of that file, which would have the same structure but store items, say, with a timestamp older than 2 months. That would require two basic steps:
1. extracting a subtree from the original file;
2. merging the extracted subtree into the archived version.
I could implement that, but I wonder if there's an existing way to do it? Or some other approach that would address the same issue?
Thanks Amy @grinn for pointing me to the necessary pieces of org-refile! It would have taken much longer to figure out otherwise.
I've made a function that org-refiles the entry at point into "archive/<file-name>.org", preserving the header structure. I only had to implement creating nonexistent headers, because `org-refile' can only create one missing level out of the box.
And another function that performs that operation on all entries found by `org-ql'.
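In case it's useful, here's a rough sketch of how those pieces can fit together. The my/ names are made up, it assumes an archive/ subdirectory already exists next to the project file, and the `(ts :to -60)' query is a stand-in for "items with a timestamp older than ~2 months"; treat it as a starting point rather than the exact code:

```elisp
(require 'org)
(require 'org-ql)  ; third-party package, used by the bulk command

(defun my/find-or-create-olp (file olp)
  "Return a marker to the heading in FILE at outline path OLP,
creating any missing headings along the way."
  (with-current-buffer (find-file-noselect file)
    (org-with-wide-buffer
     (goto-char (point-min))
     (let ((end (point-max))
           (level 1))
       (dolist (heading olp)
         (let ((re (format org-complex-heading-regexp-format
                           (regexp-quote heading)))
               found)
           ;; Look for HEADING at the right level within the current
           ;; subtree; create it at the end of that subtree if missing.
           (while (and (not found) (re-search-forward re end t))
             (setq found (= (length (match-string 1)) level)))
           (unless found
             (goto-char end)
             (unless (bolp) (insert "\n"))
             (insert (make-string level ?*) " " heading "\n")
             (forward-line -1)))
         ;; Confine the search for the next component to this subtree.
         (setq end (save-excursion (org-end-of-subtree t t) (point)))
         (setq level (1+ level)))
       (org-back-to-heading t)
       (point-marker)))))

(defun my/archive-entry ()
  "Refile the entry at point into archive/<file-name>.org,
recreating its parent headings there."
  (interactive)
  (let* ((src (buffer-file-name))
         (dest (expand-file-name
                (concat "archive/" (file-name-nondirectory src))
                (file-name-directory src)))
         (olp (org-get-outline-path))  ; ancestors of the entry at point
         (pos (when olp
                (marker-position (my/find-or-create-olp dest olp)))))
    ;; RFLOC is (HEADING FILE REGEXP POSITION); with a nil POSITION,
    ;; `org-refile' appends the entry at the top level of DEST.  Note
    ;; that the archive buffer is modified but not saved.
    (org-refile nil nil (list (car (last olp)) dest nil pos))))

(defun my/archive-old-entries ()
  "Archive every entry in the current buffer matched by the query.
Assumes matches are not nested inside one another."
  (interactive)
  (dolist (m (org-ql-select (current-buffer)
               '(ts :to -60)  ; has a timestamp at least ~60 days old
               :action #'point-marker))
    (org-with-point-at m (my/archive-entry))))
```

The helper recreates the outline path one level at a time, which is the part `org-refile' won't do by itself; everything else is stock `org-refile' called with an explicit RFLOC.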
Checked my 6,921 bookmarks on Pinboard.in: 3,462 hit dead ends with 404s or expired domains, and many of the 3,459 that remain show fake content or parking pages. Only 21% of those from the last 2 years still work as expected. The lifespan of URLs is definitely shrinking.
#Archiving #AcademicPublishing #DigitalPreservation: "When Eve broke down the results by publisher, less than 1 percent of the 204 publishers had put the majority of their content into multiple archives. (The cutoff was 75 percent of their content in three or more archives.) Fewer than 10 percent had put more than half their content in at least two archives. And a full third seemed to be doing no organized archiving at all.
At the individual publication level, under 60 percent were present in at least one archive, and over a quarter didn't appear to be in any of the archives at all. (Another 14 percent were published too recently to have been archived or had incomplete records.)
The good news is that large academic publishers appear to be reasonably good about getting things into archives; most of the unarchived issues stem from smaller publishers.
Eve acknowledges that the study has limits, primarily in that there may be additional archives he hasn't checked. There are some prominent dark archives that he didn't have access to, as well as things like Sci-hub, which violates copyright in order to make material from for-profit publishers available to the public. Finally, individual publishers may have their own archiving system in place that could keep publications from disappearing."
So uh. Best software to rip DVDs? I've tried VLC, but it spent half an hour going through the entire 2-hour movie and then rendered only the 8-second intro to a file 😬
I don't need the whole menu and all, but I need to be able to get the video, the right audio track, and the right subtitle track. I've got a bunch of old DVDs here, some of them well over 10 years old, that I'd like to archive before bitrot sets in.
Occasional reminder that the Internet Archive provides a number of tools and browser plugins to let you send pages to the Wayback Machine (as well as check if a given page has been saved).
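If you'd rather script it, the same two operations are also available as plain HTTP endpoints: Save Page Now (https://web.archive.org/save/<url>) and the availability API (https://archive.org/wayback/available?url=<url>). A minimal Emacs Lisp sketch of both, with made-up my/ names (note that anonymous Save Page Now requests may be rate-limited):

```elisp
(require 'url)
(require 'url-http)  ; for `url-http-end-of-headers'

(defun my/wayback-save (url)
  "Ask the Wayback Machine to capture URL."
  (interactive "sSave URL: ")
  (url-retrieve (concat "https://web.archive.org/save/" url)
                (lambda (_status) (message "Capture requested for %s" url))))

(defun my/wayback-latest-snapshot (url)
  "Show the most recent Wayback snapshot of URL, if there is one."
  (interactive "sCheck URL: ")
  (with-current-buffer
      (url-retrieve-synchronously
       (concat "https://archive.org/wayback/available?url="
               (url-hexify-string url)))
    (goto-char url-http-end-of-headers)
    (let ((snap (alist-get 'closest
                           (alist-get 'archived_snapshots
                                      (json-parse-buffer
                                       :object-type 'alist)))))
      (message "%s" (or (alist-get 'url snap) "No snapshot found")))))
```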
Harvard Library Innovation Lab: WARC-GPT: An Open-Source Tool for Exploring Web Archives Using AI
"...an open-source, highly-customizable Retrieval Augmented Generation tool the web archiving community can use to explore the intersection between web #archiving and #AI. WARC-GPT allows for creating custom chatbots that use a set of #web#archive files as their knowledge base, letting users explore collections through conversation." 👏
What’s the Value of 3 Million LPs in a Digital World? Easy! They Can Still Be Played in 50+ Years’ Time!
The ARChive of Contemporary Music has one of the largest collections of vinyl records in the world and is in danger of losing its home. Its champions are making a case for the future of physical media.
If someplace like a university starts a digitization p…
I've been given official blessing/permission to archive and upload historical versions of FUZE BASIC. Do any coders/data hoarders happen to have SD images of versions of FUZE BASIC (any Raspi model version) prior to version 3.4.0? I'd love to add them to the archive. #raspberrypi #preservation #archiving #coding
#SocialMedia #Twitter #Musk #APIs #Archiving #Disinformation: "When Elon Musk began requiring people to pay steep fees to access the Twitter API earlier this year, he broke a series of tools used by researchers and archivists that could be used to accurately save tweets with metadata. We are now in a situation where the best way to archive “official” information on Twitter in a rapidly changing war is to take screenshots of deleted tweets, which can be faked and may leave out potentially very important metadata, such as what location and device the tweet was posted from, specific timestamps, and unique tweet identifiers that can be used to find the tweet again later. Screenshotting things is also an incredibly inefficient, manual, and ad-hoc way of preserving anything.
“People can fake screenshots,” Miles McCain, the founder of PolitiTweet, told me. “It’s not a trusted form of archival in any way, unless you have hundreds of people with different versions of the same screenshot. I think having a trusted, reliable archive is incredibly important.”"
Good Creative Commons licensed music from the 2000s and 2010s is disappearing behind digital fences: archive.org hasn't backed up everything and Jamendo (the main site for CC music at that time) is requesting a signup for the "free download".
This just shows that centralised private services cannot be trusted with archiving the cultural heritage of mankind. All of YouTube could be gone tomorrow as well.
For the people who would love to save copies of their posts, so they can post them elsewhere (or have a backup archive in addition to the one Mastodon gives you), I recommend using Mastodon Content Mover.
I have tested it, and while its main function is to repost all your posts from your old instance to your new one, it also tickles my archiving bone: you can use it just to save all your posts.
PLEASE READ THE "WHAT IT CAN'T DO" SECTION THOUGH. You, you kinda gotta.
The way it works is that you right-click the folder you've named MastodonContentMover, paste the provided instructions into the terminal that pops up, and then let it save. By the end, you will have a folder for each post you made, marked with dates. As the site says, you can also save each post's hashtags, switch public posts to private, etc.
Not only will it save the images each post may have had, it will also save your text posts in an XML format. You can easily open these with Notepad, Notepad++, VS Code, etc. The attached image shows what one of my posts looks like when you open it.
Unfortunately I haven't tested the reposting part, because that would be a lot of posts for me... but give it a try if you think it'll work for you :v
#MUSIC #IDENTIFICATION HELP WANTED!
I've finally had an open reel transferred that I bought from a recycling shop in #Toronto, ON, many, many years ago.
Unfortunately I have no idea of the artist, album, or song names!
Do you recognize any of the music, vocals or instrumentation in this sample I've prepared?
If you can subscribe monthly at any amount, it would be a huge help! I lost quite a few subscribers during the recession & my Patreon is my primary source of income.
Some folks on Mastodon delete posts after a period, sometimes for privacy, sometimes to save server space.
Is there a nice way to download a thread/archive it?
I don’t want to distribute them. I think there are two cases:
1. I like having an archive of stuff I’ve said so I can look at it years later.
2. People have good advice/essays and I’d like to read them in the future.
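For case 1, your own posts are already covered by Mastodon's built-in archive download in your account settings. For case 2, as long as the posts are still up and public, the Mastodon REST API exposes a status and the thread around it at /api/v1/statuses/:id and /api/v1/statuses/:id/context (private posts need an access token). A rough Emacs Lisp sketch that saves both to a JSON file, with a made-up my/ name:

```elisp
(require 'url)
(require 'url-http)  ; for `url-http-end-of-headers'

(defun my/mastodon-save-thread (instance status-id file)
  "Save the post STATUS-ID on INSTANCE, plus its thread, to FILE as JSON."
  (interactive "sInstance (e.g. mastodon.social): \nsStatus id: \nFSave to: ")
  (let ((fetch
         (lambda (path)
           (with-current-buffer
               (url-retrieve-synchronously
                (format "https://%s/api/v1/statuses/%s%s"
                        instance status-id path))
             (goto-char url-http-end-of-headers)
             (json-parse-buffer :object-type 'alist)))))
    (with-temp-file file
      (insert (json-serialize
               `((status  . ,(funcall fetch ""))
                 (context . ,(funcall fetch "/context"))))))))
```

This only works while the posts still exist, of course; for threads that get deleted on a timer, you'd have to fetch them before the window closes.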