Introducing the logs

Thursday, 21 November, Year 11 d.Tr. | Author: Mircea Popescu

As you might perhaps remember, a few weeks ago I commissioned Eric Benevides to produce a sort of meld between Stanislav Datskovskiy's ad-hoc Python-based IRC logger bot, made for #trilema the IRC channel during the previous crisis, and the MP-WP LAMP-based web publishing mechanism supporting Trilema the blog. As I take it delivery is imminent, some preparatory changes are needed here, chief among which this new Logs category. It will contain, as you might expect, one new article each day, reflecting that day's ongoing goings-on in #trilema, the IRC channel. So there we go, that's all good & introduced, hurray for the A-team.

The other item needing an introduction is the IRC archive dump. As promised I sat down to splice together a complete and definitive version of the IRC logs ; the result is a body made of parts.

The first part comes from Freenode-#bitcoin-otc.log, a 15376633 byte file bearing a last-mod timestamp of Tue 22 Jul 2014 05:20:43 PM CST in my archives. While it's a historical fact that my public involvement with Bitcoin starts mid-2011, it is also a historical fact that I did not deem the whole IRC milieu important enough to keep archives until the beginning of 2012. If there exist earlier logs, I will make no effort to establish their authenticity, due to a belief that such efforts cannot be successful to a degree permitting detrimental reliance on their form, in which case what the hell's the point.

These early proto-logs run daily (with interruptions, logging was not intended to work in my absence in those early daysi) until August 11th 2012. Exactly two fragments have been elided for timeline conflict with the flow of historyii. The first oneiii came because I don't now remember who argued importantly in the manner of Internet dweebs that the reason I wasn't on #bitcoin-otc was really that I had been banned, and not personal preference, so I went and tested his theory. It turned out I hadn't been banned and the anon tard returned to the background bogonic radiation, no doubt making strong claims with no basis still, to this very day. The secondiv was part and parcel of the ample yet allegedly absentv body of evidence as to the identity of the party on top. Ever heard the self-obvious observation whereby democracy is a dubious social form owing to the happenstance that nine man gangrapes are enjoyable to nine tenths of the participants, while the tenth's underage anyway ? No, making men illegal won't change this, it'll just wreck the "rule of law" fiction (not that it's rescuable, owing to the cognitive death of the contract).

A portion was also introduced (from #bitcoin-otc-euvi), to cover the interval between April 8th and May 11th 2012vii. Throughout the Summer of 2012 commitment remained weak on my part to any IRC channel generally speaking, so it's unclear what relevancy either of these two could possibly claim for themselves, or whether the timeline should follow #bitcoin-otc.log or #bitcoin-otc-eu.log until August 10th 2012. They both fell off a cliff afterwards anyways, and the exacting attempt at reconstructing something out of disparate parts of nothing is too much like everyday work for the mind to relish low payoff applications upon historical detritus.

The last part comes from my copy of the bitcoin-assets.log as I handed it off to phf for publishing, and so that'll be all.

In the end this timeline comes to a little over two million seventy thousand lines accreted during the 1538 days between 12 January 2012 and 28 March 2016 ; a grand total of 160 or so Megabytes. People were a lot chattier then, but for little reason and even less benefit -- I indeed doubt there's enough material of actual substance in all that to produce a coupla thousand lines fit for inclusion in contemporaneous logs. Nevertheless, for historicity's sake the whole bundle's going to appear deluvionally right here, on irony's last remaining bastion, just as soon as I'm done formatting it for sql and importing the whole pile. You're... welcome, I guess.
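The day count is readily verified; a quick sketch with Python's datetime, counting both endpoint days:

```python
from datetime import date

# First and last days covered by the spliced archive.
start = date(2012, 1, 12)
end = date(2016, 3, 28)
span = (end - start).days   # full days elapsed between the endpoints
inclusive = span + 1        # counting both endpoint days gives 1538
```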

———
  1. In fact, due to the nefarious influence of morons-in-aggregate, we didn't even necessarily think public logs were needed or important back then ; this is still the situation in that first sad remnant of my natural development.

    There are two morals in this footnote. One is that idiots in organized groups are universally evil in a moral sense and occasionally effectual at being evil in an ethical sense ; therefore never permit morons to live, lest they gather up and ruin the world while you're not looking -- you can't rely on yourself to know when, in some obscure, seemingly unimportant at the time angle of your life, their idiocy is still as idiotic as it ever is, just not particularly obvious to you. The other is that the shells I've left behind, through overgrowing them, are numerous indeed -- just as you'd not likely have thought "oh, my personal experience's not that important -- #bitcoin-otc happened before", don't think "oh, #bitcoin-otc is how it all began". It's not, I overgrow things, starting from childhood, as part and parcel of what their failure to keep up with me means. []

  2. Which is to say this channel, that had once been the forum, meanwhile a ruined fane, is only capable of yielding occasional footnotes. []

  3. **** BEGIN LOGGING AT Sat Oct 13 20:40:32 2012

    Oct 13 20:40:32 * Now talking on #bitcoin-otc
    Oct 13 20:40:32 * Topic for #bitcoin-otc is: OTC marketplace for Bitcoin trading and exchange. | http://bitcoin-otc.com/ | http://bit.ly/x56Fd5 | Start with the ;;guide | Before trading, talk to people, and check ratings. | Beware the FRAUDSTERS. ;;fraud | Trade Options: http://polimedia.us/btc | BTC → MP @ http://btcpak.com | Logs: http://bit.ly/NFg1Yy | Support bitcoin-otc! 1F1dPZxdxVVigpGdsafnZ3cFBdMGDADFDe
    Oct 13 20:40:32 * Topic for #bitcoin-otc set by nanotube at Tue Jul 31 07:10:19 2012
    Oct 13 20:40:32 -ChanServ- [#bitcoin-otc] Welcome! If you're new here, please read the channel rules: http://wiki.bitcoin-otc.com/wiki/Bitcoin-otc_channel_guidelines as well as the OTC systems user guide: http://wiki.bitcoin-otc.com/wiki/Using_bitcoin-otc
    **** ENDING LOGGING AT Sat Oct 13 20:40:35 2012

    []


  4. **** BEGIN LOGGING AT Wed Jul 23 02:20:27 2014

    Jul 23 02:20:27 * Now talking on #bitcoin-otc
    Jul 23 02:20:27 * Topic for #bitcoin-otc is: OTC marketplace for Bitcoin trading and exchange. | http://bitcoin-otc.com/ | http://bit.ly/x56Fd5 | Start with the ;;guide | Before trading, talk to people, and check ratings. | Beware the FRAUDSTERS ;;fraud | VPS for BTC @ vertvps.com code SUMMEROFOTC for 50% off first invoice | Logs: http://bit.ly/NFg1Yy | Support bitcoin-otc! 1F1dPZxdxVVigpGdsafnZ3cFBdMGDADFDe
    Jul 23 02:20:27 * Topic for #bitcoin-otc set by nanotube at Tue Jun 10 00:39:32 2014
    Jul 23 02:20:27 * [freenode-info] channel flooding and no channel staff around to help? Please check with freenode support: http://freenode.net/faq.shtml#gettinghelp
    Jul 23 02:20:27 -ChanServ- [#bitcoin-otc] Welcome! If you're new here, please read the channel rules: http://wiki.bitcoin-otc.com/wiki/Bitcoin-otc_channel_guidelines as well as the OTC systems user guide: http://wiki.bitcoin-otc.com/wiki/Using_bitcoin-otc
    Jul 23 02:20:27 * gribble gives voice to mircea_popescu
    Jul 23 02:20:31 mircea_popescu ;;view 21136
    Jul 23 02:20:33 * rchasman has quit (Client Quit)
    Jul 23 02:20:34 gribble #21136 Tue Jul 22 19:19:11 2014 mircea_popescu SELL 1000000.0 Doge coins deliverable 3 or 6 months @ 3 BTC (Or less, negotiable by volume and interval. Looking to carry a few billion.)
    Jul 23 02:20:35 mircea_popescu ;;view 21135
    Jul 23 02:20:35 Ex0deus ;;tlast
    Jul 23 02:20:35 gribble 620.0
    Jul 23 02:20:38 gribble #21135 Tue Jul 22 18:54:52 2014 mircea_popescu SELL 5000.0 ethereum coins deliverable March 15th, 2015 @ 1 BTC (Up to 1k BTC's worth accepted. Get in touch.)
    Jul 23 02:20:41 mircea_popescu there we go.
    **** ENDING LOGGING AT Wed Jul 23 02:20:44 2014

    []

  5. Yet to quote some random anodyne cuck, "the mystique is stronger than ever". []
  6. The move to #bitcoin-assets followed from here, kakobrekla owned the bitcoin-otc subordinate namespace -eu, and I just proposed we make a proper one. []
  7. The only part spliced out from the beginning of the #bitcoin-otc-eu log is this fragment from April 2nd :

    **** BEGIN LOGGING AT Mon Apr 2 01:58:08 2012

    Apr 02 01:58:08 * Now talking on #bitcoin-otc-eu
    Apr 02 01:58:08 * Topic for #bitcoin-otc-eu is: Eurozone #bitcoin-otc || http://bitcoin-otc.com || Include hash tag #eu in order notes to group -eu orders. || View all tagged -eu orders here: http://bitcoin-otc.com/vieworderbook.php?notes=%23eu || Exchange rates: !bc,convert CURRENCYCODE || GET BTC WITH ukash/paysafecard ... /msg neliskybot help
    Apr 02 01:58:08 * Topic for #bitcoin-otc-eu set by kakobrekla!~T42@89-212-41-49.static.t-2.net at Sat Jan 28 15:22:07 2012
    Apr 02 01:59:45 mircea_popescu ;;rate gmaxwell -1 I don't trust him.
    Apr 02 01:59:46 gribble Rating entry successful. Your rating of -1 for user gmaxwell has been recorded.
    Apr 02 02:52:26 * allied has quit (Ping timeout: 252 seconds)
    Apr 02 02:55:03 * allied (~allied2@213.229.88.72) has joined #bitcoin-otc-eu
    **** ENDING LOGGING AT Mon Apr 2 03:02:35 2012

    Yes, that's right, back then ratings didn't take any crypto verification, the whole system relied on Freenode's namespace control.

    Actually, in the cesspool left behind, it... still does. That's right, to this very day, the pretenders to relevancy pretend to relevancy, and just as idly as ever before. []

Category: Logs

21 Responses

  1. Mircea Popescu
    Thursday, 21 November 2019

    The xchat log format is maddeningly idiotic. Consider this line :

    Jan 12 03:27:11 <pigeons> yay freddieisdead!

    The format is : month SPACE day SPACE hour COLON minute COLON second SPACE <name> TAB text. How the fuck's anyone supposed to extract anything from this idiocy ? Especially considering space's fucking ambiguously also the most common character in text (and all the others also occur, so you can't even awk -F " |:" because then you've lost some data) ?! Pshaw.
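A bounded split sidesteps the ambiguity, since only the first three space-separated fields are fixed; a minimal Python sketch (not the script actually used here):

```python
def parse_xchat_line(line):
    """Split one xchat log line into (month, day, time, name, text).

    The first three space-separated fields are date/time; the remainder
    is '<name>' TAB text, so a bounded split keeps spaces inside the
    message intact.
    """
    month, day, clock, rest = line.split(" ", 3)
    name, text = rest.split("\t", 1)
    return month, day, clock, name.strip("<>"), text

fields = parse_xchat_line("Jan 12 03:27:11 <pigeons>\tyay freddieisdead!")
# fields == ("Jan", "12", "03:27:11", "pigeons", "yay freddieisdead!")
```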

    Oh, and of course, let's not forget -- use the universal BRE/ERE/PCRE globbing character as a MULTIPLE INSTANCE tag, it'll be such a pleasure to grep for the literal string ****.

    Also worth a mention, the deeply deranged handling of categories on MP-WP, which makes it preferable to dump everything into the default "Zilnic" and fix it later, rather than harangue immense complexity into shape as we go.

    And, of course, we can't just use printf -- it's a bitch about "awk: (FILENAME=- FNR=236) fatal: not enough arguments to satisfy format string ^ ran out for this one" shenanigans ; so we're stuck with dumbass print, which sticks an ORS after each invocation, so we're stuck nulling ORS and so on and so forth for fucking ever. Why is the format for setting the input field separator so different from the format for setting the ORS ? Nooobody knowsss...

    Anyways, here's most of the way in bash :

    cat logstory.txt | grep -v '\*\*\*\* BEGI' | grep -v '\*\*\*\* ENDI' | grep . | awk -F " " -v ORS="" '{sqltemplate="\");\nINSERT INTO tril_posts(post_author, post_status, post_type, post_date, post_date_gmt, post_title, post_content) VALUES (1, \"publish\", \"future\",\""; text="" ; for(i=5;i<=NF;i++){text=text$i" "}; time= substr($3,0,5); ; sub(/</, "#", $4); sub(/>/, "</b>", $4); sub(/#/, "<b>", $4); if($2 != day){print sqltemplate}; day=$2; print "<tr><td>"$4"</td><td>"text"</td><td><font color=gray>["time"]</font></td></tr>"}' > logs.txt

    It outputs correct sql, as far as I know -- except for the part where it's a 548 character bash pipe monstrosity, thus well past any possible maintainability threshold, and it ~still~ doesn't do the date part. I want each of the ~1500 entries to be incremented by time, such that they end up published once a minute or something ; but there's no native "add 50 seconds to this timestamp" in awk, and if I end up writing one I might as well make this a proper fucking program already.
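Outside awk the increment itself is trivial; a sketch in Python with a hypothetical start time, staggering entries by 57 seconds apiece:

```python
from datetime import datetime, timedelta

# Hypothetical start time; each successive entry is pushed another
# 57 seconds into the future, one publication per entry.
start = datetime(2019, 11, 21, 12, 0, 0)
stamps = [start + timedelta(seconds=57 * i) for i in range(1500)]
```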

  2. Ugh, that does indeed seem annoying.

    Unfortunately, the archive of logs I'm importing came from an extract of the data in the Stan logger (which only goes back to 2016).

    As for how the archives got into the Stan logger in the first place, Diana Coman wrote a converter for irssi logs, and I wrote one for znc logs.

    All this to say I have not yet wrestled with xchat log formats, though I'm fine to take a poke at it. Data conversion never ends...

    As for the maddening process of assigning the category, fwiw in the mp-wp bot code I indeed ended up doing this as a multi-step thing. Here's the relevant snippet from the vpatch if it helps you any. Essentially insert record into posts table, grab post_id, then do the dance with the various relationship tables.
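The multi-step dance described reduces to: insert the post row, read back its id, then link it to the category. A sketch of the shape (sqlite3 standing in for MySQL, with simplified stand-ins for the real tril_* schemas; in MySQL the read-back is LAST_INSERT_ID()):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Simplified stand-ins for the real tril_* tables.
db.execute("CREATE TABLE tril_posts (id INTEGER PRIMARY KEY, post_title TEXT)")
db.execute("CREATE TABLE tril_term_relationships"
           " (object_id INTEGER, term_taxonomy_id INTEGER)")

# Step 1: insert the post row.
cur = db.execute("INSERT INTO tril_posts (post_title) VALUES (?)",
                 ("Forum logs for 12 Jan 2012",))
# Step 2: grab the freshly assigned post id.
post_id = cur.lastrowid
# Step 3: link it to the category (44 being the Logs taxonomy id
# used in the scripts elsewhere on this page).
db.execute("INSERT INTO tril_term_relationships"
           " (object_id, term_taxonomy_id) VALUES (?, ?)", (post_id, 44))
```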

  3. Mircea Popescu
    Thursday, 21 November 2019

    > Unfortunately, the archive of logs I'm importing came from an extract of the data in the Stan logger

    Nevermind that part. How does it do the data to sql part ?

  4. Right now I have it coded in python. It reads a formatted dump line by line comparing the timestamp on current line to previous line. If the current line is a new day, it creates a new article, otherwise it just updates the article for that day (and adds the new line to the data in the post_contents field).

    It also applies all of the various functions such as adding the timestamp delimiters, etc.

    I'm at 'saltmine' atm and commenting from dumbphone, but should be back at my terminal in ~4 hours. I will ping you in chan then and can provide various specifics as you require 'em (or obv. just ask here and I'll provide once back).

  5. Mircea Popescu
    Thursday, 21 November 2019

    Thanks ; if I'm asleep by then we just sync tomorrow or during the weekend.

  6. Mircea Popescu
    Saturday, 23 November 2019

    In other lulz of the united open front : %s "obviously" means unixtime in date (and it obviously has to come in the format +"%s" because duh) while it just as "obviously" means command-line supplied parameter in awk. Because why not, right, nothing's more obvious than the obvious stupidity of obviousness.

    Anyways, it occurs to the suffering agent that if awk dun do time, mysql does, so just let it. Add to that various fixes, trims and whatnot (especially ux refinements provided directly through teh everholy comment section), and the definitive script as actually used turns out to be :

    cat logstory.txt | grep -v '\*\*\*\* BEGI' | grep -v '\*\*\*\* ENDI' | grep . | sed 's%"%\\"%g' | awk -F " " -v ORS="" -v timecnt=1574519743 -v curryear=2012 -v articleid=89175 '{refcnt++;if($1=="Jan" && month=="Dec"){curryear++};title="Forum logs for "$2" "$1" "curryear;slug=tolower(title);gsub(" ","-",slug);sqltemplate="</table>\"); INSERT INTO tril_term_relationships (object_id, term_taxonomy_id) VALUES ("articleid-1", 44);\nINSERT INTO tril_posts(id, post_author, post_status, post_type, post_name, post_date, post_date_gmt, post_title, post_content) VALUES ("articleid",1, \"publish\", \"post\",\""slug"\", FROM_UNIXTIME("timecnt+7200"), FROM_UNIXTIME("timecnt"),\""title"\",\"<table style=\\\"font-size:1em;\\\">"; text="" ; for(i=5;i<=NF;i++){text=text$i" "}; time=substr($3,0,5); sub(/</, "#", $4); sub(/>/, "</b>", $4); sub(/#/, "<b>", $4); if($2 != day){timecnt=timecnt+57;print sqltemplate;articleid++;};print "<tr><td>"$4"</td><td>"; if($4=="*"){print "<span style=\\\"color:#d3d3d3;\\\">"};print text;if($4=="*"){print "</span>"};print"</td><td><span style=\\\"color:#d3d3d3;\\\">[<a id=\\\""refcnt"\\\" href=\\\"http://trilema.com/2019/"slug"#"refcnt"\\\">"time"</a>]</span></td></tr>";day=$2;month=$1;}' > logs.txt

    Ain't that a great way to spend 1233 characters!

    I did spend a while mired in a whole pile of

    a:3:{i:1574522448;a:1:{s:19:"publish_future_post";a:1:{s:32:"9cf6394663d7edb1f3bc55e81178f572";a:2:{s:8:"schedule";b:0;s:4:"args";a:1:{i:0;s:5:"89104";}}}}i:1574526765;a:2:{s:17:"wp_update_plugins";a:1:{s:32:"40cd750bba9870f18aada2478b24840a";a:3:{s:8:"schedule";s:10:"twicedaily";s:4:"args";a:0:{}s:8:"interval";i:43200;}}s:16:"wp_update_themes";a:1:{s:32:"40cd750bba9870f18aada2478b24840a";a:3:{s:8:"schedule";s:10:"twicedaily";s:4:"args";a:0:{}s:8:"interval";i:43200;}}}s:7:"version";i:2;}

    (from the options table) trying to get the mpwp native publishing scheduler to work, but in the end it does not. It's a gnarly format, that comes to a (for array) COLON 3 (element count). i denotes an integer, s denotes a string, so s:6:"string" would be how you say string. The hash in the example above is actually md5 (yes, lol), serving in the role of poor man's gensym. Yet even with all that, there's another missing piece somewhere, because merely inserting a formed array into the options does not trigger the future publishing ; whereas producing a future article through the interface and then switching out its id here for your "missed deadline" article WILL publish the sql-inserted article by that new id rather than the interface-produced article. (Also, simply wiping the "twicedaily" wp_update_plugins/wp_update_themes bullcrap results in this php reimplementation of cron used by mpwp simply reinterpreting your hook as a subhook, in a reconstructed hook structure. It'd prolly be good if the driver of this behaviour were found and eviscerated, but I'm guessing it'll have to wait for billymg getting round to it.)
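The format in question is PHP's serialize(); a minimal sketch of the writer side (ints, strings, booleans and arrays only), enough to reproduce the s:6:"string" example:

```python
def php_serialize(value):
    """Serialize a small subset of values the way PHP's serialize() does:
    b:<0|1>; i:<n>; s:<byte length>:"<string>"; and
    a:<count>:{<key><value>...} for arrays."""
    if isinstance(value, bool):   # bool before int: bool is an int subtype
        return "b:%d;" % int(value)
    if isinstance(value, int):
        return "i:%d;" % value
    if isinstance(value, str):
        return 's:%d:"%s";' % (len(value.encode("utf-8")), value)
    if isinstance(value, dict):
        body = "".join(php_serialize(k) + php_serialize(v)
                       for k, v in value.items())
        return "a:%d:{%s}" % (len(value), body)
    raise TypeError("unsupported: %r" % type(value))

# php_serialize("string") == 's:6:"string";'
# php_serialize({"schedule": False}) == 'a:1:{s:8:"schedule";b:0;}'
```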

    In any case, I gave up trying to get MPWP's future function to do this ; instead I just used a sleep in bash :

    cat logs.txt | while read -r line; do echo $line | mysql --host=localhost --user=etc; sleep 57; done

    And that was the end of it. Expect normal service to be restored tomorrow, go hunt or something for a day. Ta-da!

    PS. I confess it's pleasant to see things like

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    18076 * 20 0 103m 1248 1000 R 98.0 0.0 0:43.55 awk

    you know ?

  7. Mircea Popescu
    Sunday, 24 November 2019

    Well, the above quoted might've been the script "actually used" to... start the publishing with. The script publishing ended with, however, is slightly different :

    cat logstory.txt | grep -v "^\*" | grep -v "^ " | grep . | sed 's%"%\\"%g' | sed 's%*%\&ast;%g' | awk -F " " -v ORS="" -v timecnt=1574535743 -v curryear=2012 -v articleid=89188 '{refcnt++;if($1=="Jan" && month=="Dec"){curryear++};title="Forum logs for "$2" "$1" "curryear;slug=tolower(title);gsub(" ","-",slug);sqltemplate="</table>\"); INSERT INTO tril_term_relationships (object_id, term_taxonomy_id) VALUES ("articleid-1", 44);\nINSERT INTO tril_posts(id, post_author, post_status, post_type, post_name, post_date, post_date_gmt, post_title, post_content) VALUES ("articleid",1, \"publish\", \"post\",\""slug"\", FROM_UNIXTIME("timecnt+7200"), FROM_UNIXTIME("timecnt"),\""title"\",\"<table style=\\\"font-size:1em;width:580px;\\\">"; text="" ; for(i=5;i<=NF;i++){text=text$i" "}; time=substr($3,0,5); sub(/</, "#", $4); sub(/>/, "</b>", $4); sub(/#/, "<b>", $4); if($2 != day){timecnt=timecnt+57;print sqltemplate;articleid++;};print "<tr><td>"$4"</td><td class=\\\"breakup\\\">"; if($4=="*"){print "<span style=\\\"color:#d3d3d3;\\\">"};print text;if($4=="*"){print "</span>"};print"</td><td><span style=\\\"color:#d3d3d3;\\\">[<a id=\\\""refcnt"\\\" href=\\\"http://trilema.com/2019/"slug"#"refcnt"\\\">"time"</a>]</span></td></tr>";day=$2;month=$1;}' > logs4.txt

    Particularly valuable in there (other than the implicit major version count) is the magic bullet for the nasty overflow issue (no, the first fix didn't fix it all -- there's a different driver for the observed misbehaviour in very long no-space strings such as are common in, well, a crypto context). It turns out the following peculiar gymnastics resolve it : setting the table to an explicit width (580 px in the case of Trilema) ; setting the middle column to "breakup" class ; and finally adding a

    .breakup { word-break: break-all;}

    line to the style.css for the theme.

    PS. What's lovelier than typing ctrl-6 alt-\ ctrl-k into nano within a second and change, only for it to eat a minute and over to execute the orders. I really love overwhelming machines!!!

    PPS. Apparently sed 's%*%&ast;%g' does not replace each instance of the * character with the &ast; htmlentity -- because why the fuck would it! Isn't this after all the #1 reason everyone loves the sed replace format, that you can use whatever characters you wish for the separators, and therefore don't have to worry so much about escaping the content ? Fucking 'ell.

  8. I want to address this morning's Qs but am in saltmine so cannot speak in forum atm, so I'll reply here I suppose. (just tried to use my own blog but got trapped in my own modqueue... ugh):

    http://logs.ericbenevides.com/log/trilema/2019-12-02#1954028 - I mean, the id is automatically set on the table. My thought was you import dump into staging table, then insert into your main posts table from that.

    As for the staggering of posts, the post_date on each record is staggered by about a minute. Could simply just update the post_date on each (putting them into the future) once it is in the staging table.

    http://logs.ericbenevides.com/log/trilema/2019-12-02#1954031 - Hmm it *should* have the table width but I cannot check atm, so will take your word. It does not have the class="breakup"; I did not know this was required.

    http://logs.ericbenevides.com/log/trilema/2019-12-02#1954033 - I can give you a dump like this, though again remember that post id is autoset on insert.

    http://logs.ericbenevides.com/log/trilema/2019-12-02#1954034 - I'll be honest, I'm not sure what you mean here exactly. What are you escaping with sed?

    In any case, I'll be back at my terminal tonight (6 hrs) once out of mines

  9. Mircea Popescu
    Monday, 2 December 2019

    But I mean holy god, why the hell would you comment on a random log day ?! I've moved it to hang off of the intro post, forget about it.

    > My thought was you import dump into staging table

    What staging table ? I just wanna add your data to trilema, like I added teh historical logs.

    > I did not know this was required.

    It's required because otherwise very long strings with no spaces fuck up the formatting. Whole discussion is here.

    > I can give you a dump like this, though again remember that post id is autoset on insert.

    It's not dude, it is whatever you say it is ; if you don't say what it is it'll default to the next index count.

    > What are you escaping with sed?

    You can't have the following : "INSERT INTO x (a, b) VALUES ("title", "and then he said : "hurr!" and i laughed");" even if "and then he said : "hurr!" and i laughed" comes from some variable in code. You must instead say "INSERT INTO x (a, b) VALUES ("title", "and then he said : \\"hurr!\\" and i laughed");" for our needs here, so all quotes in article content must be replaced by \\". I am also switching the literal * for the &ast; encoded entity so my command line interpreter doesn't hang on it.
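At the SQL layer the substitution is a single replace; a Python sketch (the doubled backslashes quoted here reflect the extra shell-within-awk quoting layer -- the statement mysql finally sees carries one backslash per quote):

```python
def escape_for_insert(text):
    # A bare " inside the content would end the SQL string literal
    # early, so every " becomes \" before splicing into the statement.
    return text.replace('"', '\\"')

content = 'and then he said : "hurr!" and i laughed'
stmt = ('INSERT INTO x (a, b) VALUES ("title", "%s");'
        % escape_for_insert(content))
```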

    You didn't really much read teh log conversion script I posted, did you.

  10. @Mircea Popescu

    > But I mean holy god, why the hell would you comment on a random log day

    Phew, yeah sorry. I had a case of brain flatulence earlier today apparently.

    > You didn't really much read teh log conversion script I posted, did you.

    To my shame no, but I have read it now (along with the linked threads). I see where you are coming from.

    Before I turn in for the night I will get you a line-by-line dump of INSERTS into both the posts and term_relationships table as you mentioned in the logs. This dump will also have the correct table style and td class values.

    I'll also make sure the dump is properly escaped. (I was working under the impression that if we were both using teh mysqldump command to export/import then ~it~ should be doing this escaping, although after poking around the interwebs a bit it seems this is not always the case anyways..)

  11. Mircea Popescu
    Tuesday, 3 December 2019

    Works ; Ima set it to brew overnight then, and ideally have a files+db package for you tomorrow or the next day. This'll be 1GB+ for the db and a buncha GB for the files. I do intend to give out the db in the standard mysqldump format then, you don't have any publishing constraints on your end and I imagine it'll be simpler this way.

  12. Dump will be when I wake instead. I ended up having to make a few more fixes in addition to the planned fixes above. There were enough changes that I decided it was cleaner to simply re-run my archives-to-blog process overnight.

    I had initially missed that your archives had monotonic line ids (I was using a timestamp in my version), so I went ahead and fixed the bot to output them. The dump I'll be giving you will start at line id 2067223, right where your archives left off. There are a few smaller formatting fixes I made that will also be included.

  13. Mircea Popescu
    Tuesday, 3 December 2019

    Cool deal!

    I vaguely remember now you asked about this back when, and I naively cut the knot nodding along ; then sat down to do the early logs, changed it, and never even said anything. Sorry about that.

  14. Alrighty, here are teh dumps:

    tril_posts (with the sed escaping applied)
    tril_posts_table_sed.sql
    tril_posts_table_sed.sql.lobbes.sig

    tril_term_relationships
    tril_term_relationships.sql
    tril_term_relationships.sql.lobbes.sig

    tril_posts (without the sed applied. not sure if you'll need this but I did a test import with it with no issues; figured I'd provide it too)
    tril_posts_table.sql
    tril_posts_table.sql.lobbes.sig

    I'll be around for most of today/tonight if there are any issues.

  15. Mircea Popescu
    Tuesday, 3 December 2019

    Darn, why are they in separate files anyway ?

    Understand, I don't import the dumps, I run mysql queries one by one, like described in comment #6 :

    cat logs.txt | while read -r line; do echo $line | mysql --host=localhost --user=etc; sleep 57; done

    Whatever, not so hard to splice two files together on this end. However... the tail is also broken :

    href=\\"#2067925\\">23:59]','
    #trilema logs for 28 Mar 2016',0,'','publish','open','open','','trilema-logs-for
    -28-Mar-2016','','','2019-12-03 08:12:01','2019-12-03 08:12:01','',0,'http://blo
    g.ericbenevides.com/?p=5261
    ',0,'post','',0);

    The guid should obviously match the id (ie, be simply "90800" in that quoted case).

    Finally, I don't want the title to be "#trilema logs for 28 Mar 2016", I want it to be "Forum logs for 28 Mar 2016" ; I'm not quite ready to give Freenode as much weight as all that, either directly or by implication ; let that indirection layer stand in favour of republican continuity rather than permit transparency to favour heathen pretense.

    I ~could~ do all these, but esp the guid reparsing would be a pain in the ass, so would you mind re-restating the damned thing ?

  16. > Darn, why are they in separate files anyway ?

    This is due to my own limitation on how I'm exporting the things (via e.g. "mysqldump --complete-insert --no-create-info --extended-insert=False -u root -p database tril_posts > tril_posts_table.sql").

    > Whatever, not so hard to splice two files together on this end.

    Word.

    > The guid should obviously match the id (ie, be simply "90800" in that quoted case).

    Ah shit, I missed that one. I will fix. Just to make sure I understand, you want the guid == article-id (as opposed to http://www.siteurl.com/?p=article-id)?

    > I want it to be "Forum logs for 26 Mar 2016", I'm not quite ready to give Freenode as much weight as all that either in the direct or by implication

    This makes sense, and shouldn't be too difficult of a fix.

    I will aim to get an updated dump out tonight.

  17. Mircea Popescu
    Wednesday, 4 December 2019

    > Ah shit, I missed that one. I will fix. Just to make sure I understand, you want the guid == article-id (as opposed to http://www.siteurl.com/?p=article-id)?

    Apparently the correct format is indeed "http://trilema.com/?p=49529" or such, so sure, go for that. (As it turns out I didn't do this on my own import, but w/e, the fix is one simple update statement away).

  18. > Apparently the correct format is indeed "http://trilema.com/?p=49529" or such, so sure, go for that.

    Done. Refreshed dump (with guid, post_title, and post_name updates) available at:

    tril_posts (with sed escaping)
    tril_posts_table_sed.sql
    tril_posts_table_sed.sql.lobbes.sig

    tril_posts (without the sed escaping applied)
    tril_posts_table.sql
    tril_posts_table.sql.lobbes.sig

  19. Mircea Popescu
    Wednesday, 4 December 2019

    In other lulz,

    --
    -- Dumping data for table `tril_posts`
    --

    LOCK TABLES `tril_posts` WRITE;
    /&ast;!40000 ALTER TABLE `tril_posts` DISABLE KEYS &ast;/;
    INSERT INTO `tril_posts` (`ID`,[...]

    Speaking of the guid error above (which meanwhile was fixed), let us record the fixing for the curious noob :

    mysql> update tril_posts set guid = CONCAT("http://trilema.com/?p=", id) where post_status="Publish" and guid="";
    Query OK, 1485 rows affected (0.62 sec)
    Rows matched: 1485 Changed: 1485 Warnings: 0

    In yet other lulz, when we get to the splicing :

    $ wc -l posts.sql
    1293 posts.sql
    $ wc -l termr.sql
    1293 termr.sql

    which seems fine ; however posts.sql starts with "`comment_count`) VALUES (90800" and ends with "comment_count`) VALUES (92092" whereas term_relationships.sql starts with "term_order`) VALUES (90799" and ends with "term_order`) VALUES (92091". This'd be what's called an off-by-one error, aka fence post error, huh! You know the story of the man named Afanti, when he was taking ten donkeys to town ?
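The arithmetic bears it out -- both files cover 1293 consecutive ids, shifted by exactly one:

```python
posts = range(90800, 92092 + 1)   # ids appearing in posts.sql
rels = range(90799, 92091 + 1)    # object_ids in term_relationships.sql
offsets = {p - r for p, r in zip(posts, rels)}
# Every relationship row trails its paired post row by one id.
```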

    Anyway, to record that also for the curious noob :

    paste posts.sql termr.sql > logs5.txt

    :p

    And now we're ready to try publishing the first of these, standby!

  20. Mircea Popescu
    Wednesday, 4 December 2019

    Meanwhile in extra lulz,

    grep -c "2019-12-03 22:03:06" logs7.txt
    1293

    So it's time for... logs8.txt I guess :

    let timecnt=1575414633; cat logs7.txt | while read -r line; do timecnt=$((timecnt+57)); echo $line | sed "s%\x272019-12-03 22:03:06\x27,\x272019-12-03 22:03:06\x27%FROM_UNIXTIME($timecnt+7200), FROM_UNIXTIME($timecnt)%g" >> logs8.txt; done

    Hopefully it's finally right nao.

  1. [...] answer the question posed in http://trilema.com/2019/introducing-the-logs/#comment-132412 , here's how my thing1 imports data into the mpwp mysql tables. I figure I'll keep this to a brief [...]
