Feature: only keep new parts #15

Open
jbostoen opened this issue Apr 14, 2022 · 2 comments

jbostoen commented Apr 14, 2022

Note: this feature can be sponsored!

Use case: an agent responds to the client from iTop. An e-mail is sent to the client; the client hits "reply". The e-mail client automatically includes the agent's message in the reply.

In an e-mail based ticket system, it may be interesting to keep only the "new" part of the e-mail and strip the other parts.

Scenarios to consider, which complicate the implementation:

Just splitting by a basic pattern or occurrence of certain HTML is a bad idea:

  • people will answer inline in the original text.
  • people will sometimes forward e-mails, which triggers some e-mail clients to add the same HTML markup we could have used to detect "the original message" in our use case. In that case, the forwarded content would be stripped incorrectly.
  • different e-mail clients use different HTML tags and sometimes manipulate the content.

The ideal situation would be if case log entries could be queried separately. However, iTop still doesn't support this.

Other difficulties: the current implementations of Mail to Ticket (Combodo's and this fork) build a description and also process inline images etc., adding new links with random IDs.

So the basic approach, considering the current limitations:

  • fetch the case log entries and process them in reverse order (most recent first, as this is the one most likely replied to). Do not just consider the latest one.
  • build the description of the new e-mail. Replace (some/all?) URL structures. Compare against the case log entries. Once there's a match, strip and stop processing.
  • also consider that while processing the e-mail, inline images may already have been added that belong to the part that gets stripped (these should be removed again!)
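
The approach above could be sketched roughly as follows. This is only an illustration in Python, not how Mail to Ticket works today: `normalize`, `strip_known_reply` and the `min_length` threshold are hypothetical names, the case log entries are assumed to be available as a list of HTML strings (oldest first), and the comparison is done on plain text rather than on HTML. Any "From: / Sent:" header the client adds above the quoted entry would still be left behind.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores all tags and attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def normalize(html):
    """Reduce HTML to whitespace-normalized plain text for comparison."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def strip_known_reply(new_html, case_log_entries, min_length=20):
    """Return the text of new_html that precedes the first matching entry.

    case_log_entries is assumed to be ordered oldest first, so it is
    scanned in reverse (most recent first). Entries shorter than
    min_length are skipped to limit false matches on short replies.
    """
    new_text = normalize(new_html)
    for entry in reversed(case_log_entries):
        entry_text = normalize(entry)
        if len(entry_text) < min_length:
            continue
        position = new_text.find(entry_text)
        if position != -1:
            # Match found: keep only what comes before it and stop.
            return new_text[:position].strip()
    return new_text
```

Skipping very short entries is one possible way to reduce the risk mentioned under the known limitations; the right threshold would need testing on real tickets.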

Known limitations:

  • might not work well with short/similar replies, but this should be rare.
  • if two identical e-mails get sent for some reason, an empty case log entry might be created?

Investigation needed:

  • how many of the tags that clients add are removed by iTop? Is different formatting (whitespace) an issue?
    • seems to be covered by iTop
  • what if the original message is within a div with some markup: should this be configurable for removal? What are the defaults? This might be related to the "basic"/non-intelligent approach to stripping content.
  • what if there's just a reply where, for instance, " >" is added in front of each line?
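
For the plain-text case in the last point, a minimal sketch could look like this. `split_quoted_lines` is a hypothetical helper, and it assumes the body is already plain text. Because it works line by line, text the client typed inline between quoted lines is kept.

```python
def split_quoted_lines(body):
    """Separate lines starting with '>' from the rest of a plain-text body.

    Leading whitespace before '>' is tolerated. Returns a tuple of
    (new_text, quoted_lines).
    """
    new_lines, quoted_lines = [], []
    for line in body.splitlines():
        if line.lstrip().startswith(">"):
            quoted_lines.append(line)
        else:
            new_lines.append(line)
    return "\n".join(new_lines).strip(), quoted_lines
```

Note that this would also drop a line the client wrote themselves if it happens to start with ">", so it is probably better suited as a configurable option than as a default.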

jbostoen added the enhancement (New feature or request) label Apr 14, 2022

Another interesting note for the design: MS Outlook, for example, adds this HTML:

<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Some Name &lt;[email protected]&gt;<br>
<b>Sent:</b> Friday, December 15, 2023 9:59:10 AM<br>
<b>To:</b> Jeffrey Bostoen &lt;[email protected]&gt;<br>
<b>Subject:</b> RE: something</font>
<div>&nbsp;</div>
</div>
...
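
As an illustration, locating such a marker in the original (unsanitized) HTML could be sketched like this in Python. `ReplyMarkerFinder` and `find_reply_marker` are hypothetical names, and the only marker checked is the `divRplyFwdMsg` id from the sample above. It only reports where the marker is; since the same markup also shows up in forwarded e-mails, deciding whether to strip remains a separate step.

```python
from html.parser import HTMLParser


class ReplyMarkerFinder(HTMLParser):
    """Records the position of the first element with the given id."""

    def __init__(self, marker_id="divRplyFwdMsg"):
        super().__init__()
        self.marker_id = marker_id
        self.marker_pos = None  # (line, column) as reported by getpos()

    def handle_starttag(self, tag, attrs):
        if self.marker_pos is None and dict(attrs).get("id") == self.marker_id:
            self.marker_pos = self.getpos()


def find_reply_marker(html):
    """Return the (line, column) of the marker element, or None if absent."""
    finder = ReplyMarkerFinder()
    finder.feed(html)
    finder.close()
    return finder.marker_pos
```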

Now, note that iTop sanitizes HTML, and that this has already changed a bit over time (for example: ordered/unordered lists).

https://www.itophub.io/wiki/page?id=latest:admin:rich_text_limitations

So when developing something, it may be worth considering whether to compare against the original HTML (which means also temporarily storing the original HTML rather than only the sanitized version).

jbostoen self-assigned this Dec 28, 2023


jbostoen commented Jan 3, 2024

French version:

De : Abc Def [email protected]
Envoyé : vendredi 29 décembre 2023 14:42
À : Abc Def [email protected]
Cc : [email protected]
Objet : [ R-000322 ] iTop: xxx

In iTop, after sanitizing, all of the above content ended up inside a SPAN element.
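
Since the header labels depend on the language of the e-mail client, a detection step would need a label set per language. A rough Python sketch, assuming plain text input: `HEADER_LABELS`, `find_quoted_header` and the `window` size are hypothetical, and only the English and French labels seen in this issue are listed.

```python
import re

# Header labels per language; only the two variants seen so far.
HEADER_LABELS = {
    "en": ("From", "Sent", "To", "Subject"),
    "fr": ("De", "Envoyé", "À", "Objet"),
}


def _has_label(label, block):
    """True if a line in block starts with the label followed by a colon.

    Whitespace (including non-breaking spaces) is allowed before the
    label and between the label and the colon.
    """
    pattern = r"^[^\S\n]*" + re.escape(label) + r"[^\S\n]*:"
    return re.search(pattern, block, re.MULTILINE) is not None


def find_quoted_header(text, window=6):
    """Return the index of the line where a quoted header block starts.

    A block counts as a header when its first line carries the first
    label of a language and all other labels of that language appear
    within the next `window` lines. Returns None if nothing matches.
    """
    lines = text.splitlines()
    for index in range(len(lines)):
        block = "\n".join(lines[index:index + window])
        for labels in HEADER_LABELS.values():
            if _has_label(labels[0], lines[index]) and all(
                _has_label(label, block) for label in labels
            ):
                return index
    return None
```

Requiring all labels of one language close together is a deliberate choice here, so that a single line such as "To: do later" in the client's own text does not trigger a match.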

Also a point to consider: inline images (links).
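
One way to keep inline image links from breaking the comparison is to mask them on both sides first. A small Python sketch under that assumption: `mask_image_sources` and the `IMG` placeholder are hypothetical, and the regular expression only covers quoted `src` attributes on `img` tags.

```python
import re

# Matches the src attribute of an <img> tag, with single or double quotes.
IMG_SRC = re.compile(
    r'(<img\b[^>]*?\bsrc\s*=\s*)(["\'])(.*?)\2',
    re.IGNORECASE | re.DOTALL,
)


def mask_image_sources(html, placeholder="IMG"):
    """Replace every quoted <img> src value with a fixed placeholder."""
    return IMG_SRC.sub(
        lambda match: match.group(1) + match.group(2) + placeholder + match.group(2),
        html,
    )
```

After masking, two versions of the same message that only differ in the random ID inside the image link compare as equal.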
