Ted Lawless
https://lawlesst.github.io/
Work notebook
Thu, 14 Nov 24 17:17:38 EST
-
Creating Spotify Playlists via the Spotify API
I've recently pushed code to GitHub for a little hobby project I've been working on. There are public radio music shows that I enjoy but often can't listen to because they air at times when I'm busy. It's more convenient to listen on demand...
Sun, 20 Aug 23 00:00:00 EST
https://lawlesst.github.io/notebook/spotify-playlists.html
-
Importing Python code by file reference and inspecting classes
I recently had a use case for importing Python code by reference to its full file path, inspecting each class found within the source, and performing a task if a particular attribute was found. This was for a command-line interface (CLI) and I wanted the script to be run like the following:
$ python inspect_source.py /full/path/to/source.py
My initial scan of Stack Overflow answers didn't turn up a solution for exactly what I wanted to do in this case. So I turned to my other favorite resource, Python 3 Module of the Week, and read up on the standard library's importlib and inspect modules...
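As a rough illustration of the approach described above, here is a minimal sketch that uses only the standard library's importlib and inspect modules. It is not the code from the post: the function names and the `task_name` attribute are hypothetical, chosen only for the example.

```python
import importlib.util
import inspect
import sys
from pathlib import Path


def load_module_from_path(path):
    """Import a Python source file given its full file path."""
    path = Path(path)
    # Use the file name (without extension) as the module name.
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load a module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def classes_with_attribute(module, attr_name):
    """Return the names of classes defined in `module` that have `attr_name`."""
    found = []
    for name, cls in inspect.getmembers(module, inspect.isclass):
        # Skip classes that were only imported into the module from elsewhere.
        if cls.__module__ != module.__name__:
            continue
        if hasattr(cls, attr_name):
            found.append(name)
    return found


if __name__ == "__main__" and len(sys.argv) == 2:
    # "task_name" is a hypothetical attribute used only for illustration.
    loaded = load_module_from_path(sys.argv[1])
    for class_name in classes_with_attribute(loaded, "task_name"):
        print(class_name)
```

Saved as inspect_source.py, this could be run as shown above, with the path to the source file as the only argument. Note that loading the file executes its top-level code, so it should only be pointed at trusted sources.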
Fri, 18 Aug 23 00:00:00 EST
https://lawlesst.github.io/notebook/python-class-inspection.html
-
Fix the docs - SQL Server and Docker on a Mac
I recently set up Microsoft SQL Server on a MacBook and found a simple error in Microsoft's documentation. Since I didn't quickly find an answer in the normal places (Google -> Stack Overflow), I thought I would post it here in case it saves someone a few minutes. From Microsoft's "Get started with SQL Server" documentation, which is otherwise quite good and nicely organized, Step 1.1 (https://www.microsoft.com/en-us/sql-server/developer-get-started/python/mac/) lists two commands for pulling a Docker image from Docker Hub and running it...
Fri, 07 Jan 22 00:00:00 EST
https://lawlesst.github.io/notebook/mssql-server-docker.html
-
Automatically extracting keyphrases from text
I've posted an explainer/guide to how we are automatically extracting keyphrases for Constellate, a new text analytics service from JSTOR and Portico.
We are defining keyphrases as phrases of up to three words that are key, or important, to the overall subject matter of the document. "Keyphrase" is often used interchangeably with "keyword", but we are opting to use the former since it's more descriptive. We did a fair amount of reading to grasp prior art in this area (extracting keyphrases is a long-standing research topic in information retrieval and natural language processing) and ended up developing a custom solution based on term frequency in the Constellate corpus...
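To make the general idea concrete, here is a toy sketch of frequency-based keyphrase ranking. It is not the Constellate method, which the excerpt does not spell out: the TF-IDF-style weighting, the function names, and the tokenization are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def candidate_phrases(text, max_words=3):
    """Yield every run of 1 to `max_words` consecutive words in `text`."""
    words = re.findall(r"[a-z]+", text.lower())
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def top_keyphrases(document, corpus, k=5, max_words=3):
    """Rank phrases that are frequent in `document` but rare across `corpus`."""
    doc_counts = Counter(candidate_phrases(document, max_words))
    # Document frequency: how many corpus documents contain each phrase.
    doc_freq = Counter()
    for other in corpus:
        doc_freq.update(set(candidate_phrases(other, max_words)))
    n_docs = len(corpus)
    # Assumed weighting: term frequency times a smoothed inverse document frequency.
    scores = {
        phrase: count * math.log((1 + n_docs) / (1 + doc_freq[phrase]))
        for phrase, count in doc_counts.items()
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]
```

A phrase that appears in every corpus document gets a weight of zero under this scheme, which is one simple way to push common words out of the results without a stop-word list.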
Mon, 19 Jul 21 00:00:00 EST
https://lawlesst.github.io/notebook/constellate-keyphrases.html
-
Datasette hosting costs
I've been hosting a Datasette (https://baseballdb.lawlesst.net, aka baseballdb) of historical baseball data for a few years, and for the last year or so it has been hosted on Google Cloud Run. I thought I would share my hosting costs for 2020 as a point of reference for others who might be interested in running a Datasette but aren't sure how much it may cost.
The total hosting cost on Google Cloud Run for 2020 for the baseballdb was $51.31, or a monthly average of about $4.28 USD. The monthly bill varied a fair amount, from as high as $13 in May to as low as $2 in March...
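The monthly average follows directly from the annual total quoted above. A quick check, using only that figure (the variable names are just for illustration):

```python
# Annual total quoted in the post, in USD.
total_2020_usd = 51.31

# Average over the twelve months of 2020.
monthly_average = total_2020_usd / 12

print(f"${monthly_average:.2f} per month")  # prints $4.28 per month
```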
Sat, 16 Jan 21 00:00:00 EST
https://lawlesst.github.io/notebook/datasette-hosting.html
-
Connecting Python's RDFLib to AWS Neptune
I've written previously about using Python's RDFLib to connect to various triple stores. For a current project, I'm using Amazon Neptune as a triple store, and the RDFLib SPARQLStore implementation did not work out of the box. I thought I would share my solution.
The problem
Neptune returns N-Triples by default, and RDFLib (by default in version 4.2.2) expects CONSTRUCT queries to return RDF/XML...
Fri, 15 Mar 19 00:00:00 EST
https://lawlesst.github.io/notebook/rdflib-neptune.html
-
Usable sample researcher profile data
I've published a small set of web harvesting scripts to fetch information about researchers and their activities from the NIH Intramural Research Program website.
On various projects I've been involved with, it has been difficult to acquire usable sample, or test, data about researchers and their activities. You either need access to an HR system and a research information system (for the activities) or have to create mock data. Mock, or fake, data doesn't work well when you want to start integrating information across systems or develop tools to find new publications...
Sat, 19 May 18 00:00:00 EST
https://lawlesst.github.io/notebook/researcher-profile-data.html
-
Exploring 10 years of the New Yorker Fiction Podcast with Wikidata
Note: The online Datasette that supported the sample queries below is no longer available. The raw data is at: https://github.com/lawlesst/new-yorker-fiction-podcast-data.
The New Yorker Fiction Podcast recently celebrated its ten-year anniversary. For those of you not familiar, this is a monthly podcast hosted by New Yorker fiction editor Deborah Treisman where a writer who has published a short story in the New Yorker selects a favorite story from the magazine's archive and reads and discusses it on the podcast with Treisman.
I've been a regular listener to the podcast since it started in 2007 and thought it would be fun to look a little deeper at who has been invited to read and which authors they selected to read and discuss.
The New Yorker posts all episodes of the Fiction podcast on their website in nice, clean, browsable HTML pages...
Tue, 06 Feb 18 00:00:00 EST
https://lawlesst.github.io/notebook/nyer-fiction.html
-
Now Publishing Complete Lahman Baseball Database with Datasette
Summary: The Datasette API available at https://baseballdb.lawlesst.net now contains the full Lahman Baseball Database.
In a previous post, I described how I'm using Datasette to publish a subset of the Lahman Baseball Database. At that time, I only published three of the 27 tables available in the database. I've since expanded that Datasette API to include the complete Baseball Database.
The process for this was quite straightforward...
Sun, 03 Dec 17 00:00:00 EST
https://lawlesst.github.io/notebook/baseball-datasette-full.html
-
Publishing the Lahman Baseball Database with Datasette
Summary: publishing the Lahman Baseball Database with Datasette. API available at https://baseballdb.lawlesst.net.
For those of us interested in open data, an exciting new tool was released this month. It's by Simon Willison and called Datasette...
Mon, 20 Nov 17 00:00:00 EST
https://lawlesst.github.io/notebook/baseball-datasette.html