Tweets in outlines ==> encoding problem #6

Open
scripting opened this issue Feb 12, 2022 · 0 comments

I spent a half-day tracking down a problem that I want to document here.

Outlines are by default encoded as ISO-8859-1. I don't know why, but that's what the code says, and most of my OPML-creating code does that.

But Twitter's API does everything in UTF-8. So when you include text that came from Twitter in an outline and save it, you get an OPML file that has garbled text where the tweets are. Not a lot of it, which is why we've gotten this far without dealing with it. But enough that it has to be dealt with.

I think what it's going to mean is converting tweet text from UTF-8 to ISO-8859-1 as it is inserted into an outline.

Background

The davetwitter package already deals with this in a clumsy way. It changes the encoding declared in the OPML file to UTF-8.

That's not bad in its case, where the purpose of the code is to create an outline of tweets, but it isn't a general answer because sometimes you include a tweet in an outline with lots of other text that doesn't come from Twitter.

function tweetsToOpml (screenname, theDay, theTweets) {
	var opmltext = opml.stringify (sortTweets (theTweets));
	//the outline is labeled ISO-8859-1 by default, relabel it as UTF-8 to match the tweet text
	opmltext = utils.replaceAll (opmltext, "ISO-8859-1", "UTF-8");
	saveToArchive (screenname, theDay, opmltext);
	return (opmltext);
	}

Today's problem

I am working on tweets.opml.org right now, and this encoding issue shows up there, and I want to fix it.

The clumsy fix above works there too. Since the outlines only contain tweets, setting the encoding for the XML to UTF-8 makes everything look nice. An hour ago this was a vexing problem; now it's nicely solved.

But the general problem still exists. I need a simple bit of JS code that converts a bit of UTF-8-encoded text to ISO-8859-1.

scripting changed the title from "Including tweets in outlines ==> encoding problem" to "Tweets in outlines ==> encoding problem" on Feb 12, 2022