Titles syncing, content is not, and URLs are incorrect. #2

luca-git · 2025-01-01T14:00:50Z

Hi, thanks for building this, it’s badly needed. I’ve been able to build it on Linux and Windows, but it syncs only the titles: no content, and the URLs are wrong. Is it working properly for you? Any settings hints you’d mind sharing?

GuentherE · 2025-01-01T17:06:48Z

Same here. Titles are fetched, but links are invalid, e.g. site:: [www.reddit.com]()

It should be either:
site:: [Updated to latest release, and this is all I get now on local machine or remote](https://www.reddit.com/r/immich/comments/1hfk9ai/updated_to_latest_release_and_this_is_all_i_get/)
or
url:: https://www.reddit.com/r/immich/comments/1hfk9ai/updated_to_latest_release_and_this_is_all_i_get/

The saved date is also empty: date-saved:: [[]]

GuentherE · 2025-01-01T19:54:24Z

There is also another tool for logseq: https://codeberg.org/strubbl/wallabag-logseq

It is just a command line tool and not a plugin, but it imports all archived articles into a single logseq page.

Maybe someone can reuse some code to fix our import problem ...?

hnykda · 2025-01-02T21:08:10Z

Thanks for reporting! I will take a look at it.

> There is also another tool for logseq

I did find that tool before trying to implement this one, but I couldn't see how it would help. It's written in Go and is a CLI, not a JavaScript-like logseq plugin.

hnykda · 2025-01-02T21:15:20Z

@luca-git did you have the syncContent option set to true? Without that, the content is not synced.

hnykda · 2025-01-02T22:38:03Z

I pushed a new version. It removes some customizable templating and separate-pages-sync for now and hardcodes the template in a way that it can always be rendered. Can you try it for me, please?

luca-git · 2025-01-03T21:52:30Z

Yep, it was properly checked. I'll try this over the weekend :)

luca-git · 2025-01-05T00:34:41Z

Testing it: it has been syncing since today at 1pm, 12.5 hours! It used to take 3 hours to sync only titles, so this was kind of expected, but it's too much IMO. I have only 900 articles, so it's definitely an issue. If it's finished tomorrow morning I'll follow up, otherwise I'll have to find a way to set up Omnivore somewhere :(

hnykda · 2025-01-05T09:38:05Z

You do you, but from my limited eyeball testing it seemed that the bottleneck was logseq, not wallabag/omnivore (your wallabag server could of course be very slow if you run it on a potato, but I would be somewhat surprised). It might be sped up by increasing the batch size (currently 30) to e.g. 100, but I don't think you would notice. I am adding a proper release so you don't have to use this unpacked version; maybe it will be faster when built for production?
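
To make the batch-size trade-off concrete, here is a minimal TypeScript sketch of splitting articles into batches. The `chunk` helper and the `articles` array are hypothetical and not taken from the plugin; only the sizes 30 and 100 and the 900-article count come from this thread.

```typescript
// Hypothetical helper (not the plugin's code): split a list into batches of a fixed size.
function chunk<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    // slice() clamps the end index, so the last batch may be shorter.
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 900 placeholder articles, the library size mentioned in this thread.
const articles = Array.from({ length: 900 }, (_, i) => ({ id: i + 1 }));

console.log(chunk(articles, 30).length);  // 30 batches
console.log(chunk(articles, 100).length); // 9 batches
```

Going from 30 to 100 reduces the number of batches from 30 to 9, which would only help if per-batch overhead, rather than Logseq's own processing of the content, dominates the runtime.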

I believe this is the reason for the push towards logseq DB version (that has been in development for more than a year), so it's faster for bigger databases/ingestions.

> find a way to setup Omnivore somewhere

Btw, that was my first go-to strategy. But after trying it, I decided to go with wallabag instead.

luca-git · 2025-01-05T12:42:21Z

Thanks for your work. I had to stop the process now after 24 hours. I'm by no means an expert in Logseq, so I'm sure you are right. I believe syncing from Omnivore was way faster, though I can't say why. I'll try the release when available. (BTW, after force-stopping it, the created page, presumably partial, had no content, same as the previous one.) I'll do a test on Ubuntu next. Who knows.

luca-git · 2025-01-05T12:59:37Z

Where is this new version getting its config from? I see my passwords and secrets are still there in the built version you just released. Maybe it's remembering some wrong settings as well?

{
  "generalSettings": "",
  "wallabagUrl": "redacted",
  "clientId": "redacted",
  "clientSecret": "redacted",
  "userLogin": "redacted",
  "userPassword": "redacted",
  "syncContent": true,
  "frequency": 60,
  "syncAt": "2025-01-01T21:27:22",
  "graph": "personali",
  "pageName": "Wallabag",
  "disabled": true,
  "filter": "import all my articles",
  "customQuery": "",
  "highlightOrder": "the time that highlights are updated",
  "isSinglePage": true,
  "createTemplate": "",
  "createTemplateDesc": "",
  "articleTemplate": "- [{{{title}}}]({{{url}}})\n      site:: [{{{siteName}}}]({{{url}}})\n      author:: {{{author}}}\n      date-saved:: [[{{{date}}}]]\n      id-wallabag:: {{{id}}}",
  "highlightTemplate": "> {{{text}}} [⤴️]({{{highlightUrl}}}) {{#labels}} #[[{{{name}}}]] {{/labels}}\n\n{{#note.length}}note:: {{{note}}}{{/note.length}}",
  "advancedSettings": "",
  "headingBlockTitle": "## 🔖 Articles",
  "loading": true,
  "syncJobId": 0,
  "version": "1.0.1",
  "apiVersion": "\"2.6.10\"",
  "apiToken": "redacted",
  "refreshToken": "redacted",
  "expireDate": 1736085091542,
  "isTokenExpired": false
}

hnykda · 2025-01-05T19:54:12Z

I have removed as much as I could (especially things that were not used/implemented). But it should use the same config as any other plugin. You still have some old fields there, but most of them are now ignored. You might remove this and start over, that way it will source only fields that are actually used.

luca-git · 2025-01-05T19:58:25Z

I ran it this afternoon on Linux, same issue as before. Not sure how to get rid of the old config, as I uninstalled and reinstalled the plugin using the compiled one, but no luck. Is it working on your end?

hnykda · 2025-01-05T20:04:48Z

By the same issue you mean that it's taking long?

To wipe the settings, you can go to Settings -> Plugins -> Wallabag sync -> Edit settings.json -> Remove everything and save. Then add any option to the config via UI and it will regenerate a new config.

luca-git · 2025-01-05T22:01:06Z

I mean it's not syncing content after a very long process, only titles. I figured out where the configs were and ran it afresh a couple of hours ago. Will update as soon as I know.

luca-git · 2025-01-06T11:16:07Z

After resetting the configs and a lengthy process, the wallabag page is now just empty. If it works for you, it's possibly me, on both Windows and Linux for some reason. Giving up.

hnykda · 2025-01-06T21:34:06Z

Darn. I am sorry, and I appreciate you trying and helping out, really. Will work on this further.

luca-git · 2025-01-06T22:39:52Z

Thanks for your efforts on this, it's important. Omnivore was such an amazing tool, my workflow would improve a lot if you succeed.

GuentherE · 2025-01-07T18:56:01Z

Congratulations Daniel!

I've uninstalled the old version and deleted everything before installing the new one.

I've tried the latest develop version and it's working as expected: content, URL and highlights were synced. It took about 15 minutes for about 700 articles.

Author and published-at were not filled in wallabag either, but they were filled in for other fetched articles, so it's also working.

Only tags are not synced.

Now it would be great to get the multiple page feature back ... ;-)

GuentherE · 2025-01-07T19:40:26Z

During the import process it took a lot of cpu load. Probably logseq is processing the large amount of content into its fulltext engine. During the next sync it took the high cpu load again and it runs even longer. It seems that the plugin starts all over and fetches all articles again. Shouldn't it just fetch newer articles?

Without fetching content the whole first import process takes only about 20 seconds. For the next sync it took a lot more time (about 3 minutes)..

hnykda · 2025-01-07T20:35:09Z

Oh, thanks @GuentherE for testing, that is encouraging.

> Author and published-at were not filled in wallabag either, but they were filled in for other fetched articles, so it's also working.

Yeah, we do capture an author if it's available in wallabag data, so if it's not there, it's rather an issue with wallabag parser and you might propose some feature there to get higher coverage.

> Only tags are not synced.

Yeah, we do have them from wallabag, but they are not part of the template. I think this should be an easy fix.
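
A minimal sketch of how tags could be rendered into a Logseq property line, assuming each wallabag tag object carries a `label` field. The `WallabagTag` interface and the `renderTagsProperty` function are hypothetical names for illustration, not the plugin's actual code or template.

```typescript
// Assumption: a wallabag tag object exposes its display name as `label`.
interface WallabagTag {
  label: string;
}

// Hypothetical helper: build a `tags::` property line for a Logseq block.
// Returns an empty string when there is nothing to render, so the caller
// can skip the line instead of emitting an empty property.
function renderTagsProperty(tags: WallabagTag[]): string {
  const names = tags
    .map((tag) => tag.label.trim())
    .filter((name) => name.length > 0);
  if (names.length === 0) {
    return "";
  }
  // Wrap each tag as a page reference so multi-word tags stay intact.
  return "tags:: " + names.map((name) => `[[${name}]]`).join(", ");
}

console.log(renderTagsProperty([{ label: "immich" }, { label: " self hosting " }]));
```

Skipping the line for untagged articles avoids the kind of empty property seen earlier in this thread with `date-saved:: [[]]`.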

> Now it would be great to get the multiple page feature back ...

Yeah, as mentioned in the other issue, this is bigger. I am pretty sure it would take me like half a day, but that is very expensive for me.

> During the import process it took a lot of cpu load. Probably logseq is processing the large amount of content into its fulltext engine. During the next sync it took the high cpu load again and it runs even longer. It seems that the plugin starts all over and fetches all articles again. Shouldn't it just fetch newer articles? Without fetching content the whole first import process takes only about 20 seconds. For the next sync it took a lot more time (about 3 minutes)..

This is useful profiling, thanks for that! It is indeed true that we fetch everything again, and then do some kind of a comparison to see if stuff got changed in the source (wallabag) 🤔 . We should add lastSyncedAt and use that in the filter when syncing again I believe.
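
A minimal sketch of the lastSyncedAt idea, assuming the wallabag entries endpoint accepts `page`, `perPage`, and a `since` Unix timestamp (in seconds) as query parameters. `SyncState` and `buildEntriesQuery` are hypothetical names; the plugin's real code may look quite different.

```typescript
// Hypothetical persisted state: ISO 8601 time of the last successful sync.
interface SyncState {
  lastSyncedAt?: string;
}

// Build the query string for one page of entries.
// Assumption: `since` restricts results to entries updated after that timestamp.
function buildEntriesQuery(state: SyncState, page: number, perPage: number): string {
  const params = new URLSearchParams({
    page: String(page),
    perPage: String(perPage),
  });
  if (state.lastSyncedAt) {
    const millis = Date.parse(state.lastSyncedAt);
    // An unparseable timestamp falls back to a full sync instead of failing.
    if (!Number.isNaN(millis)) {
      params.set("since", String(Math.floor(millis / 1000)));
    }
  }
  return params.toString();
}

// First sync: no filter, everything is fetched.
console.log(buildEntriesQuery({}, 1, 30));
// Later sync: only entries changed since the stored timestamp.
console.log(buildEntriesQuery({ lastSyncedAt: "2025-01-07T20:35:09Z" }, 1, 30));
```

The stored timestamp would need to be updated only after a sync completes successfully, otherwise an interrupted run could cause entries to be skipped on the next one.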

luca-git · 2025-01-09T12:49:03Z

Still trying to wrap my head around the issue. Could you kindly share how the writing dir is built? I've noticed that there might be an issue there in my case. Should I insert the whole path to my graph into the plugin's path setting? (The one which has 4 dirs in it: assets, journals, logseq, pages?) Thanks.

hnykda · 2025-01-09T17:30:45Z

Sorry, writing dir? I am confused here. You can now install the current version of the plugin the official way, through the built-in plugin marketplace (remove the dev one before doing that).

luca-git · 2025-01-09T17:57:03Z

There's a parameter regarding the graph, I guess it uses that to choose where to write.

hnykda · 2025-01-09T19:25:09Z

Yes - what about it? That's not related to this plugin 🤔 . You can create as many graphs as you want and wherever you want. E.g. you can create a folder "logseqgraphs" on your desktop, create a new graph called "testingwallabag" in it, open that graph, and set that parameter to that one.

luca-git · 2025-01-09T19:29:01Z

I was hoping some wrong dir there would cause the issue, but I tested it and I still get the good ol' wallabag page, completely empty. When I have time I'll ask roocline.

hnykda mentioned this issue Jan 2, 2025

Fetching articles fails with error "invalid time value" #1

Closed
