 

New skin testing - particularly for anyone who's had problems with editing/quoting etc.



  • RMweb Gold

Normally Notepad (rather than Wordpad) will strip out the XML or HTML coding, or have Microsoft gone and improved that now? ;)

 

It shouldn't matter now. Anything now typed, pasted from Notepad, etc., will be converted to UTF8 encoding by the editor.

 

The problem applies only to the existing contents of the database posted before the software upgrade.

 

Martin.


Thanks for staying on the matter, Andy! I guess this is a fine example for issues one would very much prefer to do without, so I certainly appreciate your patience.

 

I most certainly could have done without it, Dom, particularly this week. I knew there was a fundamental security issue: for the last month or more, some PHP scripts were being uploaded within the community software and were generating spam email (not to RMweb members!). Once a fix was available for that, I knew the software update would be necessary as part of cleaning every corner. The problem came with Invision's reluctance to support the upgrade process, which would have meant I'd have been on my own in fixing everything if anything did go wrong (when you have a gut feeling that something may go wrong, it's always best to listen to your gut). It's their way of trying to steer everyone onto version 4 of the software, which I'm not wholly happy to move to yet for several reasons. Invision did agree to support the upgrade process, after I pointed out that it was their supplied product which was the issue and coughed up for some support time. It needed quite a lot of preparation both here and there, mainly because of the size of the site, and this caused the outages earlier in the week.

 

This truncation problem, however, stems from something that didn't need to be done at present but is advisable when moving to V4. So it doesn't matter how well you prepare; something else will happen.

 

This week was just a bad week, following on from a long weekend at Doncaster show to still be sitting here into the early hours each morning relaying messages between hosting and software support and chewing my fingernails wondering if/when the site will return after it's stalled/broken (again).


  • RMweb Gold

It shouldn't matter now. Anything now typed, pasted from Notepad, etc., will be converted to UTF8 encoding by the editor.

 

The problem applies only to the existing contents of the database posted before the software upgrade.

 

Martin.

My blog entries from the past year or so have been through Notepad before pasting onto RMweb, which is why I'm a little puzzled.


  • RMweb Premium

Back while I still had my web server up and running in London at the beginning of last year, I was working on a basic contact form system for it. One problem I had was character coding. I wanted to restrict what users could type into certain fields, and the mask for the characters wasn't working properly with certain characters (like £ signs, etc). It turned out that the £ sign was trying to be UTF-8 as typed in by the user, but wasn't UTF-8 when typed into the mask in the code! So checking for the £ wasn't a match due to one being ASCII and the other being UTF-8. I don't think I was able to fully resolve it before I got kicked out of the house, so it's still sitting on the server in my container storage waiting for a fix...

 

As for Notepad, it should strip out any formatting, but will leave the characters in ASCII (I think), but doesn't play well with carriage returns and line feeds compared to other text editors. Which is why I switched to using Notepad++ on Windows. I still use Notepad++ on Linux via Wine, despite the existence of its clone Notepadqq for Linux as it requires a Qt upgrade which I haven't figured out how to do yet. :scratchhead:
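
The mask mismatch described above can be reproduced in a few lines. This is only a sketch under assumptions: the original form code isn't shown in the thread, so the field mask, the function names and the choice of Latin-1 for the source file are all hypothetical. It shows a byte-level whitelist rejecting a UTF-8 pound sign, and the same check passing once the input is decoded to text first.

```python
import re

# Hypothetical field mask: digits, a decimal point and a pound sign.
# Written as bytes, as if the source file had been saved in Latin-1,
# where the pound sign is the single byte 0xA3.
MASK_LATIN1 = re.compile(rb"[0-9.\xa3]+")

# The same mask written against decoded text rather than raw bytes.
MASK_TEXT = re.compile("[0-9.\u00a3]+")


def valid_bytes(raw: bytes) -> bool:
    """Check raw form input against the byte-level mask."""
    return MASK_LATIN1.fullmatch(raw) is not None


def valid_text(raw: bytes, encoding: str = "utf-8") -> bool:
    """Decode the input first, then check it against the text-level mask."""
    return MASK_TEXT.fullmatch(raw.decode(encoding)) is not None


# A browser submitting UTF-8 sends the pound sign as the two bytes C2 A3.
browser_input = "\u00a35.00".encode("utf-8")

print(valid_bytes(browser_input))  # False: byte 0xC2 is not in the mask
print(valid_text(browser_input))   # True
```

Decoding at the boundary and comparing text with text avoids having to know which bytes each side happened to use.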


  • RMweb Gold

Have just tried expanding the reply box, after using the #'Reply quoting this post' option - I used the three diagonal lines bottom right to drag the box larger.

 

The box kept re-sizing and scrolling up the window.  It doesn't really want to play properly without a quote in the box either.


 

you may want to suggest that members stop using pound signs for a while, and instead use the currency code GBP

and I thought David Cameron was supposed to be negotiating so that we were not going to be forced to adopt some daft other currency!

 

It would be nice, but too much to ever hope for, that while they "fix" THEIR latest screwup on fonts (change for change's sake - who wanted it anyway) they can fix the &nbsp; sprinkler/leak.

 

Just testing £ ! - [Edit] no, that didn't work! £

 

but

£
Works! just a nuisance to work-around programmer's incompetence. :( and just typical of a US based company to believe that the world revolves around their $ !

  • RMweb Premium

Have just tried expanding the reply box, after using the #'Reply quoting this post' option - I used the three diagonal lines bottom right to drag the box larger.

 

The box kept re-sizing and scrolling up the window.  It doesn't really want to play properly without a quote in the box either.

 

I've just switched to the 2016 theme and I'm not having any problems on Firefox 43.0.1 on Linux Mint 17.3, it seems to resize OK.


  • RMweb Gold

My blog entries from the past year or so have been through Notepad before pasting onto RMweb, which is why I'm a little puzzled.

 

Hi Mikkel,

 

Notepad converted them to plain text, without any formatting (bold, etc.).

 

But before the upgrade the RMweb editor was encoding it in a Latin character set in the database. In which a pound symbol is represented like this: 10100011.

 

The RMweb software now expects to find text encoded in the UTF8 character set. In which the above means nothing, and a pound symbol should be: 0000000010100011.

 

Andy says IPS are now running a conversion operation on every post in the database (over 2 million of them), to convert the old to the new. It may take some time.

 

The problem goes back to the original 7-bit code for teleprinter devices (ASCII) which was an American invention and contained only one currency code, for $. For a long time after that, pound signs were represented as # symbols, which you can still see occasionally. If we were still doing that we wouldn't have a problem now, because the code for $ and # is the same in UTF8 as in the original code.

 

regards,

 

Martin.
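
For anyone who wants to see the old-to-new conversion in miniature, here is a rough sketch. It assumes the old posts were stored as Latin-1 (ISO-8859-1), which the thread only describes as "a Latin character set", and it says nothing about how IPS actually run their conversion. The function name and the sample post are made up for illustration.

```python
def latin1_to_utf8(stored: bytes) -> bytes:
    """Re-encode bytes that were stored as Latin-1 into UTF-8."""
    return stored.decode("latin-1").encode("utf-8")


# Hypothetical old post: the pound sign is stored as the single byte 0xA3.
old_post = b"Price: \xa325"

# Read as UTF-8 without converting, the lone 0xA3 byte is invalid.
try:
    old_post.decode("utf-8")
    readable_as_utf8 = True
except UnicodeDecodeError:
    readable_as_utf8 = False

converted = latin1_to_utf8(old_post)

print(readable_as_utf8)  # False
print(converted)         # b'Price: \xc2\xa325'
print(converted.decode("utf-8") == "Price: \u00a325")  # True

# $ and # are 7-bit ASCII, so their bytes are the same either way.
print("$#".encode("latin-1") == "$#".encode("utf-8"))  # True
```

Nothing is lost as long as the bytes are decoded with the encoding they were really written in before being re-encoded.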


  • RMweb Gold
It would be nice, but too much to ever hope for, that while they "fix" THEIR latest screwup on fonts (change for change's sake - who wanted it anyway) they can fix the &nbsp; sprinkler/leak.

 

Hi Kenton,

 

I think that problem is actually in the free CK Editor which IPS are using:

 

 http://ckeditor.com/

 

It's free and open-source, so in theory you could fix it yourself. :)

 

regards,

 

Martin.


...

 

The RMweb software now expects to find text encoded in the UTF8 character set. In which the above means nothing, and a pound symbol should be: 0000000010100011.

 

 

 

Point of order, which probably no-one else will be interested in:

Surely a pound symbol in UTF8 would be 11000010 10100011 (UTF-8 being variable-length encoding, and the pound being U+00A3 or C2 A3 in hex?) Wouldn't the version with all the leading 0s in UTF-8 be equivalent to a Unicode null followed by an invalid character? 

 

Every time I look up this stuff I think I understand it, then have to look it up again half an hour later.
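
The bit patterns are easy to check rather than look up again. A small sketch, assuming Python 3; the helper name `bits` is just for illustration. It prints the pound sign's bytes in Latin-1 and UTF-8, then tests whether the 16-bit pattern with leading zeros is valid UTF-8.

```python
def bits(data: bytes) -> str:
    """Render bytes as space-separated 8-bit binary groups."""
    return " ".join(f"{b:08b}" for b in data)


POUND = "\u00a3"  # U+00A3 POUND SIGN

print(bits(POUND.encode("latin-1")))  # 10100011
print(bits(POUND.encode("utf-8")))    # 11000010 10100011
print(POUND.encode("utf-8").hex())    # c2a3

# The 16-bit pattern with leading zeros is the two bytes 0x00 and 0xA3.
padded = bytes([0b00000000, 0b10100011])
try:
    padded.decode("utf-8")
    valid = True
except UnicodeDecodeError:
    valid = False

print(valid)  # False: 0xA3 on its own is a stray continuation byte

# That same pattern is what the pound sign looks like in big-endian UTF-16.
print(bits(POUND.encode("utf-16-be")))  # 00000000 10100011
```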


  • RMweb Gold

Point of order, which probably no-one else will be interested in:

Surely a pound symbol in UTF8 would be 11000010 10100011 (UTF-8 being variable-length encoding, and the pound being U+00A3 or C2 A3 in hex?) Wouldn't the version with all the leading 0s in UTF-8 be equivalent to a Unicode null followed by an invalid character? 

 

Every time I look up this stuff I think I understand it, then have to look it up again half an hour later.

 

You are correct. I was trying to keep it simple. :(


 

Hi Kenton,

 

I think that problem is actually in the free CK Editor which IPS are using:

 

 http://ckeditor.com/

 

It's free and open-source, so in theory you could fix it yourself. :)

 

regards,

 

Martin.

 

 

The issue is more about why it changed, and why I get different results with the "replying to" and the "quick reply" options. They really should be consistent, using the same encoding routines!

 

Even that example above was interesting: sometimes it accepts the HTML codes (like the <br /> to give a new line), other times it just accepts a keyed CR. It is the inconsistency that is the problem.

 

If it was so popular (and as good as it claims from that link) then the &nbsp; sprinkler would have been noticed (and fixed) before now.

 

I also get different quantities of the no-break space depending on who posts and almost the time of day they post. There is always one at the start of quoted text and one after it (that must be placed there by the software). Some quoted posts are next to unreadable.

 

As very few (I'm not quite the only one) get this problem I have concluded it must be some js processing of the input text! (as just about everyone knows I block js and most other internet crud) That must mean that it is in the bad coding inside the 3rd party js.

 

I'd look at that code, but as I'm expecting RMWeb to go off line with the next update it would be a waste of my time. Besides I know how precious the little darlings are over their "innovative" "fully featured" code. I'd probably have the desire to rewrite the lot. (I'm too old to give a damn these days)


  • RMweb Gold

Andy - I've been testing this theme for a couple of days with no new problems.  However, one problem I have is that whenever I quote another post, the name of that poster, date etc. is repeated in the body of the quote.  And if I edit the post again it adds yet another line.  It happens on all themes and only when I use IE 11, but not when I use Edge.  My OS is Windows 10. The strange thing is I haven't managed to find any other posters that appear to suffer the same issue.

 

It happens on all the PCs I use so perhaps it's a profile issue?

 

Andy - I've now traced this issue to the use of Adblock Plus 1.5 for IE 11.  If it's enabled, the title line of the quote is repeated in the text of the quote (and again after every edit), but if I start IE 11 with this add-on disabled it doesn't do it.  So I assume it's something to do with the way the forum software package deals with this add-on being enabled.  I've not seen any other issues with this site otherwise.

 

Resetting IE fixed the problem only because in so doing it disabled the add-on.


  • RMweb Gold

If it was so popular (and as good as it claims from that link) then the &nbsp; sprinkler would have been noticed (and fixed) before now.

 

Hi Kenton,

 

It's popular, but known to be buggy and not everyone likes it. A nice editor is this one from Romania:

 

 https://www.froala.com/wysiwyg-editor

 

regards,

 

Martin.


FWIW, I've looked back through some of my (now truncated) blog posts.  It's not just the pound sign that causes truncation but several 'special' characters, such as a hyphen, double quotation marks (which I've used for inches), apostrophe, and, no doubt many more.  Few entries seem to have escaped unscathed!  I usually used Notepad to prepare entries for posting.

 

Mike


  • RMweb Premium

FWIW, I've looked back through some of my (now truncated) blog posts.  It's not just the pound sign that causes truncation but several 'special' characters, such as a hyphen, double quotation marks (which I've used for inches), apostrophe, and, no doubt many more.  Few entries seem to have escaped unscathed!  I usually used Notepad to prepare entries for posting.

 

Mike

 

This is because a number of the 'special' characters aren't in the ASCII 7-bit set. They might have been in the 8 bit set (extended) but since that bit has been used to help point to the UTF character sets, those codes are 'no longer valid' and have to be replaced (which will be what the Invision team are doing with the old posts at the moment).

 

https://en.wikipedia.org/wiki/ASCII - 0 to 127

 

https://en.wikipedia.org/wiki/Extended_ASCII - 128 to 255 - these are not compatible with UTF-8

 

http://www.ascii-code.com/ - the complete 0 to 255 of the ASCII characters
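
As a rough illustration of why the 128 to 255 range is the troublesome one, the sketch below classifies byte values by the role they play in UTF-8 and finds the first byte in a stored string that breaks UTF-8 decoding. It rests on assumptions the thread doesn't confirm: that the affected characters were typographic quotes and dashes stored as Windows-1252, and that text gets cut at the first invalid byte. The function names and the sample text are hypothetical.

```python
def classify(byte: int) -> str:
    """Describe the role a single byte value plays in UTF-8."""
    if byte < 0x80:
        return "ascii"          # 0-127: same meaning in ASCII, Latin-1 and UTF-8
    if byte < 0xC0:
        return "continuation"   # 10xxxxxx: only valid after a lead byte
    if byte in (0xC0, 0xC1) or byte > 0xF4:
        return "invalid"        # never appears in well-formed UTF-8
    return "lead"               # starts a 2-, 3- or 4-byte sequence


def first_bad_offset(stored: bytes) -> int:
    """Return the offset of the first byte that breaks UTF-8 decoding, or -1."""
    try:
        stored.decode("utf-8")
        return -1
    except UnicodeDecodeError as err:
        return err.start


# Hypothetical blog text with a curly double quote, an en dash and a curly
# apostrophe, stored as Windows-1252 (bytes 0x94, 0x96 and 0x92).
sample = "5\u201d wheel \u2013 it\u2019s".encode("cp1252")

print(classify(0x24))            # ascii (the dollar sign)
print(classify(0xA3))            # continuation (the Latin-1 pound sign byte)
print(first_bad_offset(sample))  # 1
print(sample[:1])                # b'5' is all that precedes the first bad byte
```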


This is because a number of the 'special' characters aren't in the ASCII 7-bit set. They might have been in the 8 bit set (extended) but since that bit has been used to help point to the UTF character sets, those codes are 'no longer valid' and have to be replaced

Well, if the coding system that has been used for years is to be replaced, it seems that an awful lot of information in databases across the world will be 'lost'.  It's a good thing that the ancient Egyptians did not have computers or we would know nothing about their civilisation :)


  • RMweb Gold

Well, if the coding system that has been used for years is to be replaced, it seems that an awful lot of information in databases across the world will be 'lost'.  It's a good thing that the ancient Egyptians did not have computers or we would know nothing about their civilisation :)

 

Hi Mike,

 

It gets lost only if the changeover isn't done properly.

 

Martin.


  • RMweb Premium

Well, if the coding system that has been used for years is to be replaced, it seems that an awful lot of information in databases across the world will be 'lost'.  It's a good thing that the ancient Egyptians did not have computers or we would know nothing about their civilisation :)

 

Well, 8 bit ASCII was fine for a long time as most of the world of computing (a relatively small number of people) was content using the Latin characters that ASCII represented, and Microsoft was happy to use the 8 bit set which meant most PCs were using 8 bit ASCII and had to have everything in English (or the other very similar Latin/Germanic languages).

 

Unfortunately/fortunately, the rest of the world decided it liked using computers, smartphones, etc, and wanted their computers to be able to use their characters. So, to be able to represent all of that (including Egyptian hieroglyphs, maybe) more space was needed to encode all the different characters, hence Unicode, and UTF-8, an extensible character set. It's something that's really only happened in the last decade or so since the internet and smartphone explosion reached out to the non-English character world. It was predicted earlier, of course, but like so many of these things (IPv4 vs IPv6) it's taken a while for it to be critical enough to get noticed and corrected.


Andy, I saw this once before and don't know if it's related to the (special character + what exactly?) post mangling that's going on presently but, 

 

Here: http://www.rmweb.co.uk/community/index.php?/topic/108082-Hornby-issues-profit-warning/page-25 right after Mike's post 617, all text is italicized for the rest of the page.

 

I don't know if this is related to new formats being available or the recently introduced viewing issue but it's odd.

 

EDIT: I'd say it's related to the post mangling bug. It happens here too and in the new theme as well.  Clearly something changed. Can we back it out?


Archived

This topic is now archived and is closed to further replies.

