Introducing the logs

Thursday, 21 November, Year 11 d.Tr. | Author: Mircea Popescu

As you might perhaps remember, a few weeks ago I commissioned Eric Benevides to produce a sort of meld between Stanislav Datskovskiy's ad-hoc Python-based IRC logger bot, made for #trilema the IRC channel during the previous crisis, and the MP-WP LAMP-based web publishing mechanism supporting Trilema the blog. As I take it delivery is imminent, some preparatory changes are needed here, chief among which this new Logs category. It will contain, as you might expect, one new article each day, reflecting that day's ongoing goings-on in #trilema, the IRC channel. So there we go, that's all good & introduced, hurray for the A-team.

The other item needing an introduction is the IRC archive dump. As promised I sat down to splice together a complete and definitive version of the IRC logs ; the result is a body made of parts.

The first part comes from Freenode-#bitcoin-otc.log, a 15376633 byte file bearing a last-mod timestamp of Tue 22 Jul 2014 05:20:43 PM CST in my archives. While it's a historical fact that my public involvement with Bitcoin starts mid-2011, it is also a historical fact that I did not deem the whole IRC milieu important enough to keep archives until the beginning of 2012. If there exist earlier logs, I will make no effort to establish their authenticity, due to a belief that such efforts cannot be successful to a degree permitting detrimental reliance on their form, in which case what the hell's the point.

These early proto-logs run daily (with interruptions, logging was not intended to work in my absence in those early daysi) until August 11th 2012. Exactly two fragments have been elided for timeline conflict with the flow of historyii. The first oneiii came because I don't now remember who argued importantly in the manner of Internet dweebs that the reason I wasn't on #bitcoin-otc was really that I had been banned, and not personal preference, so I went and tested his theory. It turned out I hadn't been banned and the anon tard returned to the background bogonic radiation, no doubt making strong claims with no basis still, to this very day. The secondiv was part and parcel of the ample yet allegedly absentv body of evidence as to the identity of the party on top. Ever heard the self-obvious observation whereby democracy is a dubious social form owing to the happenstance that nine man gangrapes are enjoyable to nine tenths of the participants, while the tenth's underage anyway ? No, making men illegal won't change this, it'll just wreck the "rule of law" fiction (not that it's rescuable, owing to the cognitive death of the contract).

A portion was also introduced (from #bitcoin-otc-euvi), to cover the interval between April 8th and May 11th 2012vii. Throughout the Summer of 2012 commitment remained weak on my part to any IRC channel generally speaking, so it's unclear what relevancy either of these two could possibly claim for themselves, or whether the timeline should follow #bitcoin-otc.log or #bitcoin-otc-eu.log until August 10th 2012. They both fell off a cliff afterwards anyways, and the exacting attempt at reconstructing something out of disparate parts of nothing is too much like everyday work for the mind to relish low payoff applications upon historical detritus.

The last part comes from my copy of the bitcoin-assets.log as I handed it off to phf for publishing, and so that'll be all.

In the end this timeline comes to a little over two million seventy thousand lines accreted during the 1538 days between 12 January 2012 and 28 March 2016 ; a grand total of 160 or so Megabytes. People were a lot chattier then, but for little reason and even less benefit -- I indeed doubt there's enough material of actual substance in all that to produce a coupla thousand lines fit for inclusion in contemporaneous logs. Nevertheless, for historicity's sake the whole bundle's going to appear deluvionally right here, on irony's last remaining bastion, just as soon as I'm done formatting it for sql and importing the whole pile. You're... welcome, I guess.
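The day count is readily verified; a quick sketch with Python's datetime, counting both endpoint days:

```python
from datetime import date

# First and last days covered by the spliced archive.
start = date(2012, 1, 12)
end = date(2016, 3, 28)
span = (end - start).days   # full days elapsed between the endpoints
inclusive = span + 1        # counting both endpoint days gives 1538
```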

———
  1. In fact, due to the nefarious influence of morons-in-aggregate, we didn't even necessarily think public logs were needed or important back then ; this is still the situation in that first sad remnant of my natural development.

    There are two morals in this footnote. One is that idiots in organized groups are universally evil in a moral sense and occasionally effectual at being evil in an ethical sense ; therefore never permit morons to live, lest they gather up and ruin the world while you're not looking -- you can't rely on yourself to know when, in some obscure, seemingly unimportant at the time angle of your life, their idiocy is still as idiotic as it ever is, just not particularly obvious to you. The other is that the shells I've left behind, through overgrowing them, are numerous indeed -- just as you'd not likely have thought "oh, my personal experience's not that important -- #bitcoin-otc happened before", don't think "oh, #bitcoin-otc is how it all began". It's not, I overgrow things, starting from childhood, as part and parcel of what their failure to keep up with me means. []

  2. Which is to say this channel, that had once been the forum, meanwhile a ruined fane, is only capable of yielding occasional footnotes. []

  3. **** BEGIN LOGGING AT Sat Oct 13 20:40:32 2012

    Oct 13 20:40:32 * Now talking on #bitcoin-otc
    Oct 13 20:40:32 * Topic for #bitcoin-otc is: OTC marketplace for Bitcoin trading and exchange. | http://bitcoin-otc.com/ | http://bit.ly/x56Fd5 | Start with the ;;guide | Before trading, talk to people, and check ratings. | Beware the FRAUDSTERS. ;;fraud | Trade Options: http://polimedia.us/btc | BTC → MP @ http://btcpak.com | Logs: http://bit.ly/NFg1Yy | Support bitcoin-otc! 1F1dPZxdxVVigpGdsafnZ3cFBdMGDADFDe
    Oct 13 20:40:32 * Topic for #bitcoin-otc set by nanotube at Tue Jul 31 07:10:19 2012
    Oct 13 20:40:32 -ChanServ- [#bitcoin-otc] Welcome! If you're new here, please read the channel rules: http://wiki.bitcoin-otc.com/wiki/Bitcoin-otc_channel_guidelines as well as the OTC systems user guide: http://wiki.bitcoin-otc.com/wiki/Using_bitcoin-otc
    **** ENDING LOGGING AT Sat Oct 13 20:40:35 2012

    []


  4. **** BEGIN LOGGING AT Wed Jul 23 02:20:27 2014

    Jul 23 02:20:27 * Now talking on #bitcoin-otc
    Jul 23 02:20:27 * Topic for #bitcoin-otc is: OTC marketplace for Bitcoin trading and exchange. | http://bitcoin-otc.com/ | http://bit.ly/x56Fd5 | Start with the ;;guide | Before trading, talk to people, and check ratings. | Beware the FRAUDSTERS ;;fraud | VPS for BTC @ vertvps.com code SUMMEROFOTC for 50% off first invoice | Logs: http://bit.ly/NFg1Yy | Support bitcoin-otc! 1F1dPZxdxVVigpGdsafnZ3cFBdMGDADFDe
    Jul 23 02:20:27 * Topic for #bitcoin-otc set by nanotube at Tue Jun 10 00:39:32 2014
    Jul 23 02:20:27 * [freenode-info] channel flooding and no channel staff around to help? Please check with freenode support: http://freenode.net/faq.shtml#gettinghelp
    Jul 23 02:20:27 -ChanServ- [#bitcoin-otc] Welcome! If you're new here, please read the channel rules: http://wiki.bitcoin-otc.com/wiki/Bitcoin-otc_channel_guidelines as well as the OTC systems user guide: http://wiki.bitcoin-otc.com/wiki/Using_bitcoin-otc
    Jul 23 02:20:27 * gribble gives voice to mircea_popescu
    Jul 23 02:20:31 mircea_popescu ;;view 21136
    Jul 23 02:20:33 * rchasman has quit (Client Quit)
    Jul 23 02:20:34 gribble #21136 Tue Jul 22 19:19:11 2014 mircea_popescu SELL 1000000.0 Doge coins deliverable 3 or 6 months @ 3 BTC (Or less, negotiable by volume and interval. Looking to carry a few billion.)
    Jul 23 02:20:35 mircea_popescu ;;view 21135
    Jul 23 02:20:35 Ex0deus ;;tlast
    Jul 23 02:20:35 gribble 620.0
    Jul 23 02:20:38 gribble #21135 Tue Jul 22 18:54:52 2014 mircea_popescu SELL 5000.0 ethereum coins deliverable March 15th, 2015 @ 1 BTC (Up to 1k BTC's worth accepted. Get in touch.)
    Jul 23 02:20:41 mircea_popescu there we go.
    **** ENDING LOGGING AT Wed Jul 23 02:20:44 2014

    []

  5. Yet to quote some random anodyne cuck, "the mystique is stronger than ever". []
  6. The move to #bitcoin-assets followed from here, kakobrekla owned the bitcoin-otc subordinate namespace -eu, and I just proposed we make a proper one. []
  7. The only part spliced out from the beginning of the #bitcoin-otc-eu log is this fragment from April 2nd :

    **** BEGIN LOGGING AT Mon Apr 2 01:58:08 2012

    Apr 02 01:58:08 * Now talking on #bitcoin-otc-eu
    Apr 02 01:58:08 * Topic for #bitcoin-otc-eu is: Eurozone #bitcoin-otc || http://bitcoin-otc.com || Include hash tag #eu in order notes to group -eu orders. || View all tagged -eu orders here: http://bitcoin-otc.com/vieworderbook.php?notes=%23eu || Exchange rates: !bc,convert CURRENCYCODE || GET BTC WITH ukash/paysafecard ... /msg neliskybot help
    Apr 02 01:58:08 * Topic for #bitcoin-otc-eu set by kakobrekla!~T42@89-212-41-49.static.t-2.net at Sat Jan 28 15:22:07 2012
    Apr 02 01:59:45 mircea_popescu ;;rate gmaxwell -1 I don't trust him.
    Apr 02 01:59:46 gribble Rating entry successful. Your rating of -1 for user gmaxwell has been recorded.
    Apr 02 02:52:26 * allied has quit (Ping timeout: 252 seconds)
    Apr 02 02:55:03 * allied (~allied2@213.229.88.72) has joined #bitcoin-otc-eu
    **** ENDING LOGGING AT Mon Apr 2 03:02:35 2012

    Yes, that's right, back then ratings didn't take any crypto verification, the whole system relied on Freenode's namespace control.

    Actually, in the cesspool left behind, it... still does. That's right, to this very day, the pretenders to relevancy pretend to relevancy, and just as idly as ever before. []

Category: Logs

21 Responses

  1. Mircea Popescu
    Thursday, 21 November 2019

    The xchat log format is maddeningly idiotic. Consider this line :

    Jan 12 03:27:11 <pigeons> yay freddieisdead!

    The format is : month SPACE day SPACE hour COLON minute COLON second SPACE <name> TAB text. How the fuck's anyone supposed to extract anything from this idiocy ? Especially considering space's fucking ambiguously also the most common character in text (and all the others also occur, so you can't even awk -F " |:" because then you've lost some data) ?! Pshaw.
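A bounded split sidesteps the ambiguity, since only the first three space-separated fields are fixed; a minimal Python sketch (not the script actually used here):

```python
def parse_xchat_line(line):
    """Split one xchat log line into (month, day, time, name, text).

    The first three space-separated fields are date/time; the remainder
    is '<name>' TAB text, so a bounded split keeps spaces inside the
    message intact.
    """
    month, day, clock, rest = line.split(" ", 3)
    name, text = rest.split("\t", 1)
    return month, day, clock, name.strip("<>"), text

fields = parse_xchat_line("Jan 12 03:27:11 <pigeons>\tyay freddieisdead!")
# fields == ("Jan", "12", "03:27:11", "pigeons", "yay freddieisdead!")
```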

    Oh, and of course, let's not forget -- use the universal BRE/ERE/PCRE globbing character as a MULTIPLE INSTANCE tag, it'll be such a pleasure to grep for the literal string ****.

    Also worth a mention, the deeply deranged handling of categories on MP-WP, which makes it preferable to dump everything into the default "Zilnic" and fix it later, rather than harangue immense complexity into shape as we go.

    And, of course, we can't just use printf -- it's a bitch about "awk: (FILENAME=- FNR=236) fatal: not enough arguments to satisfy format string ^ ran out for this one" shenanigans ; so we're stuck with dumbass print, which sticks an ORS after each invocation, so we're stuck nulling ORS and so on and so forth for fucking ever. Why is the format for setting the input field separator so different from the format for setting the ORS ? Nooobody knowsss...

    Anyways, here's most of the way in bash :

    cat logstory.txt | grep -v '\*\*\*\* BEGI' | grep -v '\*\*\*\* ENDI' | grep . | awk -F " " -v ORS="" '{sqltemplate="\");\nINSERT INTO tril_posts(post_author, post_status, post_type, post_date, post_date_gmt, post_title, post_content) VALUES (1, \"publish\", \"future\",\""; text="" ; for(i=5;i<=NF;i++){text=text$i" "}; time= substr($3,0,5); ; sub(/</, "#", $4); sub(/>/, "</b>", $4); sub(/#/, "<b>", $4); if($2 != day){print sqltemplate}; day=$2; print "<tr><td>"$4"</td><td>"text"</td><td><font color=gray>["time"]</font></td></tr>"}' > logs.txt

    It outputs correct sql, as far as I know -- except for the part where it's a 548 character bash pipe monstrosity, thus well past any possible maintainability threshold, and it ~still~ doesn't do the date part. I want each of the ~1500 entries to be incremented by time, such that they end up published once a minute or something ; but there's no native "add 50 seconds to this timestamp" in awk, and if I end up writing one I might as well make this a proper fucking program already.
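Outside awk the increment itself is trivial; a sketch in Python with a hypothetical start time, staggering entries by 57 seconds apiece:

```python
from datetime import datetime, timedelta

# Hypothetical start time; each successive entry is pushed another
# 57 seconds into the future, one publication per entry.
start = datetime(2019, 11, 21, 12, 0, 0)
stamps = [start + timedelta(seconds=57 * i) for i in range(1500)]
```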

  2. Ugh, that does indeed seem annoying.

    Unfortunately, the archive of logs I'm importing came from an extract of the data in the Stan logger (which only goes back to 2016).

    As for how the archives got into the Stan logger in the first place, Diana Coman wrote a converter for irssi logs, and I wrote one for znc logs.

    All this to say I have not yet wrestled with xchat log formats, though I'm fine to take a poke at it. Data conversion never ends...

    As for the maddening process of assigning the category, fwiw in the mp-wp bot code I indeed ended up doing this as a multi-step thing. Here's the relevant snippet from the vpatch if it helps you any. Essentially insert record into posts table, grab post_id, then do the dance with the various relationship tables.
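The multi-step dance described reduces to: insert the post row, read back its id, then link it to the category. A sketch of the shape (sqlite3 standing in for MySQL, with simplified stand-ins for the real tril_* schemas; in MySQL the read-back is LAST_INSERT_ID()):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Simplified stand-ins for the real tril_* tables.
db.execute("CREATE TABLE tril_posts (id INTEGER PRIMARY KEY, post_title TEXT)")
db.execute("CREATE TABLE tril_term_relationships"
           " (object_id INTEGER, term_taxonomy_id INTEGER)")

# Step 1: insert the post row.
cur = db.execute("INSERT INTO tril_posts (post_title) VALUES (?)",
                 ("Forum logs for 12 Jan 2012",))
# Step 2: grab the freshly assigned post id.
post_id = cur.lastrowid
# Step 3: link it to the category (44 being the Logs taxonomy id
# used in the scripts elsewhere on this page).
db.execute("INSERT INTO tril_term_relationships"
           " (object_id, term_taxonomy_id) VALUES (?, ?)", (post_id, 44))
```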

  3. Mircea Popescu
    Thursday, 21 November 2019

    > Unfortunately, the archive of logs I'm importing came from an extract of the data in the Stan logger

    Nevermind that part. How does it do the data to sql part ?

  4. Right now I have it coded in python. It reads a formatted dump line by line comparing the timestamp on current line to previous line. If the current line is a new day, it creates a new article, otherwise it just updates the article for that day (and adds the new line to the data in the post_contents field).

    It also applies all of the various functions such as adding the timestamp delimiters, etc.

    I'm at 'saltmine' atm and commenting from dumbphone, but should be back at my terminal in ~4 hours. I will ping you in chan then and can provide various specifics as you require 'em (or obv. just ask here and I'll provide once back).

  5. Mircea Popescu
    Thursday, 21 November 2019

    Thanks ; if I'm asleep by then we just sync tomorrow or during the weekend.

  6. Mircea Popescu
    Saturday, 23 November 2019

    In other lulz of the united open front : %s "obviously" means unixtime in date (and it obviously has to come in the format +"%s" because duh) while it just as "obviously" means command-line supplied parameter in awk. Because why not, right, nothing's more obvious than the obvious stupidity of obviousness.

    Anyways, it occurs to the suffering agent that if awk dun do time, mysql does, so just let it. Add to that various fixes, trims and whatnot (especially ux refinements provided directly through teh everholy comment section), and the definitive script as actually used turns out to be :

    cat logstory.txt | grep -v '\*\*\*\* BEGI' | grep -v '\*\*\*\* ENDI' | grep . | sed 's%"%\\"%g' | awk -F " " -v ORS="" -v timecnt=1574519743 -v curryear=2012 -v articleid=89175 '{refcnt++;if($1=="Jan" && month=="Dec"){curryear++};title="Forum logs for "$2" "$1" "curryear;slug=tolower(title);gsub(" ","-",slug);sqltemplate="</table>\"); INSERT INTO tril_term_relationships (object_id, term_taxonomy_id) VALUES ("articleid-1", 44);\nINSERT INTO tril_posts(id, post_author, post_status, post_type, post_name, post_date, post_date_gmt, post_title, post_content) VALUES ("articleid",1, \"publish\", \"post\",\""slug"\", FROM_UNIXTIME("timecnt+7200"), FROM_UNIXTIME("timecnt"),\""title"\",\"<table style=\\\"font-size:1em;\\\">"; text="" ; for(i=5;i<=NF;i++){text=text$i" "}; time=substr($3,0,5); sub(/</, "#", $4); sub(/>/, "</b>", $4); sub(/#/, "<b>", $4); if($2 != day){timecnt=timecnt+57;print sqltemplate;articleid++;};print "<tr><td>"$4"</td><td>"; if($4=="*"){print "<span style=\\\"color:#d3d3d3;\\\">"};print text;if($4=="*"){print "</span>"};print"</td><td><span style=\\\"color:#d3d3d3;\\\">[<a id=\\\""refcnt"\\\" href=\\\"http://trilema.com/2019/"slug"#"refcnt"\\\">"time"</a>]</span></td></tr>";day=$2;month=$1;}' > logs.txt

    Ain't that a great way to spend 1233 characters!

    I did spend a while mired in a whole pile of

    a:3:{i:1574522448;a:1:{s:19:"publish_future_post";a:1:{s:32:"9cf6394663d7edb1f3bc55e81178f572";a:2:{s:8:"schedule";b:0;s:4:"args";a:1:{i:0;s:5:"89104";}}}}i:1574526765;a:2:{s:17:"wp_update_plugins";a:1:{s:32:"40cd750bba9870f18aada2478b24840a";a:3:{s:8:"schedule";s:10:"twicedaily";s:4:"args";a:0:{}s:8:"interval";i:43200;}}s:16:"wp_update_themes";a:1:{s:32:"40cd750bba9870f18aada2478b24840a";a:3:{s:8:"schedule";s:10:"twicedaily";s:4:"args";a:0:{}s:8:"interval";i:43200;}}}s:7:"version";i:2;}

    (from the options table) trying to get the mpwp native publishing scheduler to work, but in the end it does not. It's a gnarly format, that comes to a (for array) COLON 3 (element count). i denotes an integer, s denotes a string, so s:6:"string" would be how you say string. The hash in the example above is actually md5 (yes, lol), serving in the role of poor man's gensym. Yet even with all that, there's another missing piece somewhere, because merely inserting a formed array into the options does not trigger the future publishing ; whereas producing a future article through the interface and then switching out its id here for your "missed deadline" article WILL publish the sql-inserted article by that new id rather than the interface-produced article. (Also, simply wiping the "twicedaily" wp_update_plugins/wp_update_themes bullcrap results in this php reimplementation of cron used by mpwp simply reinterpreting your hook as a subhook, in a reconstructed hook structure. It'd prolly be good if the driver of this behaviour were found and eviscerated, but I'm guessing it'll have to wait for billymg getting round to it.)
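The format in question is PHP's serialize(); a minimal sketch of the writer side (ints, strings, booleans and arrays only), enough to reproduce the s:6:"string" example:

```python
def php_serialize(value):
    """Serialize a small subset of values the way PHP's serialize() does:
    b:<0|1>; i:<n>; s:<byte length>:"<string>"; and
    a:<count>:{<key><value>...} for arrays."""
    if isinstance(value, bool):   # bool before int: bool is an int subtype
        return "b:%d;" % int(value)
    if isinstance(value, int):
        return "i:%d;" % value
    if isinstance(value, str):
        return 's:%d:"%s";' % (len(value.encode("utf-8")), value)
    if isinstance(value, dict):
        body = "".join(php_serialize(k) + php_serialize(v)
                       for k, v in value.items())
        return "a:%d:{%s}" % (len(value), body)
    raise TypeError("unsupported: %r" % type(value))

# php_serialize("string") == 's:6:"string";'
# php_serialize({"schedule": False}) == 'a:1:{s:8:"schedule";b:0;}'
```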

    In any case, I gave up trying to get MPWP's future function to do this ; instead I just used a sleep in bash :

    cat logs.txt | while read -r line; do echo $line | mysql --host=localhost --user=etc; sleep 57; done

    And that was the end of it. Expect normal service to be restored tomorrow, go hunt or something for a day. Ta-da!

    PS. I confess it's pleasant to see things like

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    18076 * 20 0 103m 1248 1000 R 98.0 0.0 0:43.55 awk

    you know ?

  7. Mircea Popescu
    Sunday, 24 November 2019

    Well, the above quoted might've been the script "actually used" to... start the publishing with. The script publishing ended with, however, is slightly different :

    cat logstory.txt | grep -v "^\*" | grep -v "^ " | grep . | sed 's%"%\\"%g' | sed 's%*%\&ast;%g' | awk -F " " -v ORS="" -v timecnt=1574535743 -v curryear=2012 -v articleid=89188 '{refcnt++;if($1=="Jan" && month=="Dec"){curryear++};title="Forum logs for "$2" "$1" "curryear;slug=tolower(title);gsub(" ","-",slug);sqltemplate="</table>\"); INSERT INTO tril_term_relationships (object_id, term_taxonomy_id) VALUES ("articleid-1", 44);\nINSERT INTO tril_posts(id, post_author, post_status, post_type, post_name, post_date, post_date_gmt, post_title, post_content) VALUES ("articleid",1, \"publish\", \"post\",\""slug"\", FROM_UNIXTIME("timecnt+7200"), FROM_UNIXTIME("timecnt"),\""title"\",\"<table style=\\\"font-size:1em;width:580px;\\\">"; text="" ; for(i=5;i<=NF;i++){text=text$i" "}; time=substr($3,0,5); sub(/</, "#", $4); sub(/>/, "</b>", $4); sub(/#/, "<b>", $4); if($2 != day){timecnt=timecnt+57;print sqltemplate;articleid++;};print "<tr><td>"$4"</td><td class=\\\"breakup\\\">"; if($4=="*"){print "<span style=\\\"color:#d3d3d3;\\\">"};print text;if($4=="*"){print "</span>"};print"</td><td><span style=\\\"color:#d3d3d3;\\\">[<a id=\\\""refcnt"\\\" href=\\\"http://trilema.com/2019/"slug"#"refcnt"\\\">"time"</a>]</span></td></tr>";day=$2;month=$1;}' > logs4.txt

    Particularly valuable in there (other than the implicit major version count) is the magic bullet for the nasty overflow issue (no, the first fix didn't fix it all -- there's a different driver for the observed misbehaviour in very long no-space strings such as are common in, well, a crypto context). It turns out the following peculiar gymnastics resolve it : setting the table to an explicit width (580 px in the case of Trilema) ; setting the middle column to "breakup" class ; and finally adding a

    .breakup { word-break: break-all;}

    line to the style.css for the theme.

    PS. What's lovelier than typing ctrl-6 alt-\ ctrl-k into nano within a second and change, only for it to eat a minute and over to execute the orders. I really love overwhelming machines!!!

    PPS. Apparently sed 's%*%&ast;%g' does not replace each instance of the * character with the &ast; htmlentity -- because why the fuck would it! Isn't this after all the #1 reason everyone loves the sed replace format, that you can use whatever characters you wish for the separators, and therefore don't have to worry so much about escaping the content ? Fucking 'ell.

  8. I want to address this morning's Qs but am in saltmine so cannot speak in forum atm, so I'll reply here I suppose. (just tried to use my own blog but got trapped in my own modqueue... ugh):

    http://logs.ericbenevides.com/log/trilema/2019-12-02#1954028 - I mean, the id is automatically set on the table. My thought was you import dump into staging table, then insert into your main posts table from that.

    As for the staggering of posts, the post_date on each record is staggered by about a minute. Could simply just update the post_date on each (putting them into the future) once it is in the staging table.

    http://logs.ericbenevides.com/log/trilema/2019-12-02#1954031 - Hmm it *should* have the table width but I cannot check atm, so will take your word. It does not have the class="breakup"; I did not know this was required.

    http://logs.ericbenevides.com/log/trilema/2019-12-02#1954033 - I can give you a dump like this, though again remember that post id is autoset on insert.

    http://logs.ericbenevides.com/log/trilema/2019-12-02#1954034 - I'll be honest, I'm not sure what you mean here exactly. What are you escaping with sed?

    In any case, I'll be back at my terminal tonight (6 hrs) once out of mines

  9. Mircea Popescu
    Monday, 2 December 2019

    But I mean holy god, why the hell would you comment on a random log day ?! I've moved it to hang off of the intro post, forget about it.

    > My thought was you import dump into staging table

    What staging table ? I just wanna add your data to trilema, like I added teh historical logs.

    > I did not know this was required.

    It's required because otherwise very long strings with no spaces fuck up the formatting. Whole discussion is here.

    > I can give you a dump like this, though again remember that post id is autoset on insert.

    It's not dude, it is whatever you say it is ; if you don't say what it is it'll default to the next index count.

    > What are you escaping with sed?

    You can't have the following : "INSERT INTO x (a, b) VALUES ("title", "and then he said : "hurr!" and i laughed");" even if "and then he said : "hurr!" and i laughed" comes from some variable in code. You must instead say "INSERT INTO x (a, b) VALUES ("title", "and then he said : \\"hurr!\\" and i laughed");" for our needs here, so all quotes in article content must be replaced by \\". I am also switching the literal * for the &ast; encoded entity so my command line interpreter doesn't hang on it.
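At the SQL layer the substitution is a single replace; a Python sketch (the doubled backslashes quoted here reflect the extra shell-within-awk quoting layer -- the statement mysql finally sees carries one backslash per quote):

```python
def escape_for_insert(text):
    # A bare " inside the content would end the SQL string literal
    # early, so every " becomes \" before splicing into the statement.
    return text.replace('"', '\\"')

content = 'and then he said : "hurr!" and i laughed'
stmt = ('INSERT INTO x (a, b) VALUES ("title", "%s");'
        % escape_for_insert(content))
```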

    You didn't really much read teh log conversion script I posted, did you.

  10. @Mircea Popescu

    > But I mean holy god, why the hell would you comment on a random log day

    Phew, yeah sorry. I had a case of brain flatulence earlier today apparently.

    > You didn't really much read teh log conversion script I posted, did you.

    To my shame no, but I have read it now (along with the linked threads). I see where you are coming from.

    Before I turn in for the night I will get you a line-by-line dump of INSERTS into both the posts and term_relationships table as you mentioned in the logs. This dump will also have the correct table style and td class values.

    I'll also make sure the dump is properly escaped. (I was working under the impression that if we were both using teh mysqldump command to export/import then ~it~ should be doing this escaping, although after poking around the interwebs a bit it seems this is not always the case anyways..)

  11. Mircea Popescu
    Tuesday, 3 December 2019

    Works ; Ima set it to brew overnight then, and ideally have a files+db package for you tomorrow or the next day. This'll be 1GB+ for the db and a buncha GB for the files. I do intend to give out the db in the standard mysqldump format then, you don't have any publishing constraints on your end and I imagine it'll be simpler this way.

  12. Dump will be when I wake instead. I ended up having to make a few more fixes in addition to the planned fixes above. There were enough changes that I decided it was cleaner to simply re-run my archives-to-blog process overnight.

    I had initially missed that your archives had monotonic line ids (I was using a timestamp in my version), so I went ahead and fixed the bot to output them. The dump I'll be giving you will start at line id 2067223, right where your archives left off. There are a few smaller formatting fixes I made that will also be included.

  13. Mircea Popescu
    Tuesday, 3 December 2019

    Cool deal!

    I vaguely remember now you asked about this back when, and I naively cut the knot nodding along ; then sat down to do the early logs, changed it, and never even said anything. Sorry about that.

  14. Alrighty, here are teh dumps:

    tril_posts (with the sed escaping applied)
    tril_posts_table_sed.sql
    tril_posts_table_sed.sql.lobbes.sig

    tril_term_relationships
    tril_term_relationships.sql
    tril_term_relationships.sql.lobbes.sig

    tril_posts (without the sed applied. not sure if you'll need this but I did a test import with it with no issues; figured I'd provide it too)
    tril_posts_table.sql
    tril_posts_table.sql.lobbes.sig

    I'll be around for most of today/tonight if there are any issues.

  15. Mircea Popescu
    Tuesday, 3 December 2019

    Darn, why are they in separate files anyway ?

    Understand, I don't import the dumps, I run mysql queries one by one, like described in comment #6 :

    cat logs.txt | while read -r line; do echo $line | mysql --host=localhost --user=etc; sleep 57; done

    Whatever, not so hard to splice two files together on this end. However... the tail is also broken :

    href=\\"#2067925\\">23:59]','
    #trilema logs for 28 Mar 2016',0,'','publish','open','open','','trilema-logs-for
    -28-Mar-2016','','','2019-12-03 08:12:01','2019-12-03 08:12:01','',0,'http://blo
    g.ericbenevides.com/?p=5261
    ',0,'post','',0);

    The guid should obviously match the id (ie, be simply "90800" in that quoted case).

    Finally, I don't want the title to be "#trilema logs for 28 Mar 2016", I want it to be "Forum logs for 28 Mar 2016" ; I'm not quite ready to give Freenode as much weight as all that, either directly or by implication ; let that indirection layer stand in favour of republican continuity rather than permit transparency to favour heathen pretense.

    I ~could~ do all these, but esp the guid reparsing would be a pain in the ass, so would you mind re-restating the damned thing ?

  16. > Darn, why are they in separate files anyway ?

    This is due to my own limitation on how I'm exporting the things (via e.g. "mysqldump --complete-insert --no-create-info --extended-insert=False -u root -p database tril_posts > tril_posts_table.sql").

    > Whatever, not so hard to splice two files together on this end.

    Word.

    > The guid should obviously match the id (ie, be simply "90800" in that quoted case).

    Ah shit, I missed that one. I will fix. Just to make sure I understand, you want the guid == article-id (as opposed to http://www.siteurl.com/?p=article-id)?

    > I want it to be "Forum logs for 26 Mar 2016", I'm not quite ready to give Freenode as much weight as all that either in the direct or by implication

    This makes sense, and shouldn't be too difficult of a fix.

    I will aim to get an updated dump out tonight.

  17. Mircea Popescu
    Wednesday, 4 December 2019

    > Ah shit, I missed that one. I will fix. Just to make sure I understand, you want the guid == article-id (as opposed to http://www.siteurl.com/?p=article-id)?

    Apparently the correct format is indeed "http://trilema.com/?p=49529" or such, so sure, go for that. (As it turns out I didn't do this on my own import, but w/e, the fix is one simple update statement away).

  18. > Apparently the correct format is indeed "http://trilema.com/?p=49529" or such, so sure, go for that.

    Done. Refreshed dump (with guid, post_title, and post_name updates) available at:

    tril_posts (with sed escaping)
    tril_posts_table_sed.sql
    tril_posts_table_sed.sql.lobbes.sig

    tril_posts (without the sed escaping applied)
    tril_posts_table.sql
    tril_posts_table.sql.lobbes.sig

  19. Mircea Popescu
    Wednesday, 4 December 2019

    In other lulz,

    --
    -- Dumping data for table `tril_posts`
    --

    LOCK TABLES `tril_posts` WRITE;
    /&ast;!40000 ALTER TABLE `tril_posts` DISABLE KEYS &ast;/;
    INSERT INTO `tril_posts` (`ID`,[...]

    Speaking of the guid error above (which meanwhile was fixed), let us record the fixing for the curious noob :

    mysql> update tril_posts set guid = CONCAT("http://trilema.com/?p=", id) where post_status="Publish" and guid="";
    Query OK, 1485 rows affected (0.62 sec)
    Rows matched: 1485 Changed: 1485 Warnings: 0

    In yet other lulz, when we get to the splicing :

    $ wc -l posts.sql
    1293 posts.sql
    $ wc -l termr.sql
    1293 termr.sql

    which seems fine ; however posts.sql starts with "`comment_count`) VALUES (90800" and ends with "comment_count`) VALUES (92092" whereas term_relationships.sql starts with "term_order`) VALUES (90799" and ends with "term_order`) VALUES (92091". This'd be what's called an off-by-one error, aka fence post error, huh! You know the story of the man named Afanti, when he was taking ten donkeys to town ?
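The arithmetic bears it out -- both files cover 1293 consecutive ids, shifted by exactly one:

```python
posts = range(90800, 92092 + 1)   # ids appearing in posts.sql
rels = range(90799, 92091 + 1)    # object_ids in term_relationships.sql
offsets = {p - r for p, r in zip(posts, rels)}
# Every relationship row trails its paired post row by one id.
```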

    Anyway, to record that also for the curious noob :

    paste posts.sql termr.sql > logs5.txt

    :p

    And now we're ready to try publishing the first of these, standby!

  20. Mircea Popescu
    Wednesday, 4 December 2019

    Meanwhile in extra lulz,

    grep -c "2019-12-03 22:03:06" logs7.txt
    1293

    So it's time for... logs8.txt I guess :

    let timecnt=1575414633; cat logs7.txt | while read -r line; do timecnt=$((timecnt+57)); echo $line | sed "s%\x272019-12-03 22:03:06\x27,\x272019-12-03 22:03:06\x27%FROM_UNIXTIME($timecnt+7200), FROM_UNIXTIME($timecnt)%g" >> logs8.txt; done

    Hopefully it's finally right nao.

  1. [...] answer the question posed in http://trilema.com/2019/introducing-the-logs/#comment-132412 , here's how my thing1 imports data into the mpwp mysql tables. I figure I'll keep this to a brief [...]
