User talk:Spitzak
MS-DOS 1.0
It used the FAT-12 filesystem on 160 KB single-sided 8-sector 5¼-inch floppies. It was extremely primitive in some respects, yet still a great advance over commonly-used CP/M filesystems, since the exact file length, file modification date and time, etc. were recorded. Subdirectories were added in DOS 2.0, yet the DOS 1 directory entry format remained unchanged until the introduction of LFNs in Windows 95... AnonMoos (talk) 12:29, 2 December 2009 (UTC)
UTF-16
Hi, I reverted your deletions in UTF-16; see the edit summary. You probably have a point in some of the deletions, but I did not see that in the whole. By the way, as I understand it, "word" (as a bit-length unit) is not a term used in Unicode, which makes it hard for me to follow. -DePiep (talk) 22:48, 21 June 2010 (UTC)
License tagging for File:Unicode 2400 Chrome Ubuntu.png
Thanks for uploading File:Unicode 2400 Chrome Ubuntu.png. You don't seem to have indicated the license status of the image. Wikipedia uses a set of image copyright tags to indicate this information; to add a tag to the image, select the appropriate tag from this list, click on this link, then click "Edit this page" and add the tag to the image's description. If there doesn't seem to be a suitable tag, the image is probably not appropriate for use on Wikipedia.
For help in choosing the correct tag, or for any other questions, leave a message on Wikipedia:Media copyright questions. Thank you for your cooperation. --ImageTaggingBot (talk) 07:05, 26 October 2010 (UTC)
Imposter
I blocked and cleaned up after that person who was trying to impersonate you. -- Gogo Dodo (talk) 05:50, 8 April 2011 (UTC)
about deletion of the information from the section strcat
I don't understand why you deleted the information from the strcat_s section of the strcat page, so I am going to undo it. If you have any problem, please drop a message on my talk page and give your suggestions. -- Prasannjit Gondchawar (talk) 19:17, 1 October 2011 (UTC)
Disambiguation link notification for March 13
Hi. When you recently edited Code page 437, you added links pointing to the disambiguation pages !! and 1/4 (check to confirm | fix with Dab solver). Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot (talk) 10:58, 13 March 2012 (UTC)
"Seek is O(1) in code units."
Can you give an algorithm demonstrating that? To find the nth character, do you not have to examine the preceding ones to determine that you indeed have the nth? (Of course, there are other issues here as well: the O(1) algorithm for char* is obviously at risk for buffer overruns, e.g., unless you have a solid upper bound, and can be fooled even then.) -- Elphion (talk) 16:27, 21 April 2012 (UTC)
- Oh, I see: I was misreading "code unit" as "character". But this is not interesting: no one is interested in seeking to the nth code unit unless you already have something like a LUT for the string converting Char(n) into CU(m) (or more broadly, a list of starting points that you are interested in -- beginning of paragraphs, etc. -- i.e., something you get by already having scanned the text). -- Elphion (talk) 16:32, 21 April 2012 (UTC)
- I should be clearer: I am "on your side" here -- the argument that UTF8 or UTF16 strings can't be treated as arrays is a red herring, because strings shouldn't be treated as arrays until they have been thoroughly scanned. If one truly needs an array of the characters, it can be built during the scan. But the argument that seeking the nth CU is O(1) is irrelevant to this. -- Elphion (talk) 16:53, 21 April 2012 (UTC)
- The mistake you are making is thinking that there is a need to count "characters" at all. First of all, the word "character" is poorly defined in Unicode (it depends on the interpreted normalization, and quite a few code points may not be "characters"), so it is impossible to count them except by string scanning, in any encoding. I suspect however you mean "Unicode code points" when you say "characters". Or perhaps "UTF-16 code units" (where Unicode code points greater than U+FFFF are 2 units). I hope you can see from even these few examples, where I am unsure what you intend, why talking about "characters" is a bad idea.
- In any case this makes as much sense as saying there is a need to find the N'th word or letter 'x' or anything else in O(1) time. There is no need for this, and text processing is quite fast despite the inability to do searches in less than linear time. The problem is that you need to remember *offsets* into strings, and it is desirable to turn an offset into a pointer to the character in O(1) time. The obvious solution that any programmer should think of is to use fixed-size units for this "offset"; in fact it is such a no-brainer that it seems hard to believe anybody would ever think otherwise. However, decades of indoctrination where every man page says "characters" when talking about offsets seem to have turned even experienced programmers into complete morons when they encounter UTF-8. Spitzak (talk) 01:57, 24 April 2012 (UTC)
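To make the offset argument above concrete, here is a minimal Python sketch. The sample string and variable names are illustrative assumptions, not taken from any article. It stores an offset measured in UTF-8 code units (bytes), uses it later as a direct index, and contrasts that with counting code points, which needs a scan of the prefix.

```python
# UTF-8 code units are bytes, so a bytes object is an array of code units.
text = "naïve café £5".encode("utf-8")

# A search returns an offset measured in code units (bytes).
offset = text.find("£".encode("utf-8"))

# Using the remembered offset later is a direct index into the buffer;
# nothing before the offset has to be examined again.
tail = text[offset:].decode("utf-8")

# By contrast, learning how many code points precede that position
# requires decoding (scanning) the whole prefix.
code_points_before = len(text[:offset].decode("utf-8"))

print(offset, tail, code_points_before)
```

The byte offset and the code point count differ because "ï" and "é" each take two code units in UTF-8; only the byte offset can be used to index the buffer directly.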
Lexicographic
Hi -- not arguing the merits of your change, but pointing out that since the wording was being discussed on the talk page, that's where you should have floated the change. Otherwise it's a quick descent into edit warring! -- Elphion (talk) 13:04, 18 September 2012 (UTC)
Nak
Please don't remove sourced content. A "nak" is a female "yak" - it's in the yak article. Rklawton (talk) 03:13, 23 October 2013 (UTC)
About the slashes
About Slash_(punctuation)#Encoding. The facts are clear (including the misleading Unicode naming), but I think we could improve the prose.
How about the section intro setup like: "Slashes are encoded in Unicode as ... and ...". But the Unicode naming is controversial/disputed.
(then the next paragraph says:) Typographically ... (zoom in on diffs).
One issue is, we should not push both definition and naming issue in one paragraph "encoding". What do you think? -DePiep (talk) 18:26, 17 April 2014 (UTC)
You receive a medal!
The Detail Medal
For your FLTK change Polluks ★ 12:10, 28 August 2014 (UTC)
Unicode terminology - BOM vs. Signature, code unit, ...
For the term Unicode Signature, see http://www.unicode.org/versions/Unicode9.0.0/UnicodeStandard-9.0.pdf, chapter 2.13 Special Characters ("Unicode Signature. An initial BOM may also serve as an implicit marker to identify a file as containing Unicode text.") and the following tables: Table 23-6, Table 23-7
A code point is the abstract form of a character, irrespective of its encoding.
A specific method of encoding code points, irrespective of endianness, is an encoding form.
A code unit is the basic unit of encoding in a given encoding form: one or more code units encode a single code point: 1-4 units in UTF-8, 1-2 units in UTF-16 (if 2 units are needed, they're called a surrogate pair), ...
An encoding form with specific endianness is an encoding scheme.
Strictly speaking, BOM is the single code point (character) U+FEFF, irrespective of its encoding.
The byte sequence that results when you encode the BOM character using a specific encoding scheme is what is loosely also called a BOM. More accurately, such a byte sequence is an encoding of the BOM character.
However, given that these byte sequences are also used to identify encoding schemes to which the concept of byte order doesn't apply - such as UTF-8 and UTF-7, which use bytes as the code units - it is better to call these byte sequences Unicode signatures.
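The terminology above can be illustrated with a short Python sketch. The helper names `utf8_units` and `utf16_units` are hypothetical and exist only for this example; the codec names are Python's standard ones. It counts the code units needed for single code points and shows the byte sequences that result from encoding U+FEFF under different encoding schemes.

```python
def utf8_units(ch: str) -> int:
    """Number of 8-bit code units used to encode one code point in UTF-8."""
    return len(ch.encode("utf-8"))

def utf16_units(ch: str) -> int:
    """Number of 16-bit code units used to encode one code point in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2

BOM = "\ufeff"  # the single code point U+FEFF

# Encoding the same code point under different encoding schemes gives
# different byte sequences (the "signatures" discussed above).
signatures = {
    "utf-8": BOM.encode("utf-8"),
    "utf-16-le": BOM.encode("utf-16-le"),
    "utf-16-be": BOM.encode("utf-16-be"),
}

for ch in ["A", "é", "€", "\U0001F600"]:
    print(f"U+{ord(ch):04X}: {utf8_units(ch)} UTF-8 units, {utf16_units(ch)} UTF-16 units")
print(signatures)
```

U+1F600 is the only sample here that needs two UTF-16 code units, i.e. a surrogate pair.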
tab
So fixing the image might be a good idea; reverting to the previous version is not the best solution. DGerman (talk) 22:40, 3 May 2017 (UTC)
Diff: https://en.wikipedia.org/w/index.php?title=UTF-8&oldid=prev&diff=781055797
Your comment is """Apparently not valid UCS-2, those code points are called "invalid" in *all* unicode forms)""", which is not correct. As you can see in https://en.wikipedia.org/wiki/UTF-16#U.2BD800_to_U.2BDFFF , """UCS-2, UTF-8, and UTF-32 can encode these code points in trivial and obvious ways""", and in fact, *this* is the difference between UTF-16 and UCS-2: UTF-16 doesn't encode the surrogates range (<U+D800..U+DFFF>), but encodes non-BMP characters (<U+10000 to U+10FFFF>), but UCS-2 encodes the surrogate code points (<U+D800..U+DFFF>), but nothing out of BMP. Behnam (talk) 22:15, 18 May 2017 (UTC)
- Unicode has declared that the code points for surrogate halves are invalid. They are equally unable/able to be encoded in UCS-2 as UTF-16. It is trivial to insert these codes into *both* encodings, which means any "validity" is equal in both. If a high and low surrogate half happen to be next to each other, most of modern Windows will interpret that as a single Unicode code point, therefore the text is UTF-16, not UCS-2 (which would require Windows to interpret it as two invalid code points). Spitzak (talk) 23:18, 18 May 2017 (UTC)
- Unicode Surrogate Pair characters are not "invalid code-points", they are "valid code-points" of type Surrogate, but, yes, they are not "Unicode Scalar Values". (See http://unicode.org/glossary/#unicode_scalar_value , http://unicode.org/glossary/#code_point and http://unicode.org/glossary/#code_point_type). Anyways, the whole point of my change was to help readers understand why those code-unit/code-points are allowed in Windows and other systems in the first place (because they were based on UCS-2, before UTF-16 existed), where those values are valid code-points. Behnam (talk) 05:10, 19 May 2017 (UTC)
- The reason they are allowed in Windows filenames is that it is VASTLY simpler to implement the file system that way, not because of any idea of whether a code point is "valid" in Unicode. Several valid code points like '/' are not allowed in the filenames (I believe, though possibly the Win32 API has a non-path API to name files which would allow these too). A huge problem currently is code that believes these code points will magically not happen because they are "invalid" or whatever, and saying that changing to UCS-2 versus any other encoding somehow changes these code points' validity is IMHO very harmful to getting the idiot savants who write things to stop doing such stupid actions.
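The disagreement above is about how the same 16-bit units are interpreted. The following Python sketch illustrates the two readings; it uses Python's own codecs as a stand-in and makes no claim about what Windows itself does. The helper `encodes_strictly` is a hypothetical name introduced for this example.

```python
import struct

# Two adjacent 16-bit units: a high surrogate followed by a low surrogate.
data = b"\x3d\xd8\x00\xde"  # little-endian units 0xD83D, 0xDE00

# Read as UTF-16, the pair is a single code point outside the BMP.
as_utf16 = data.decode("utf-16-le")

# Read as plain 16-bit values (the UCS-2 view), it is two separate units.
as_units = struct.unpack("<2H", data)

def encodes_strictly(s: str, encoding: str) -> bool:
    """True if Python's strict error handler accepts the string."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

lone = "\ud800"  # an unpaired surrogate code point

# Python's strict codecs refuse it, but it is trivial to write the value
# into either encoding when the check is bypassed.
forced_utf16 = lone.encode("utf-16-le", "surrogatepass")
forced_utf8 = lone.encode("utf-8", "surrogatepass")

print(as_utf16, [hex(u) for u in as_units], forced_utf16, forced_utf8)
```

Code that handles such data has to decide what to do with unpaired surrogates rather than assume they never occur.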
UTF-16
Thanks for clearing that up. It was completely opaque before; I guessed one interpretation, trusting that someone would correct it if I guessed wrong! -- Elphion (talk) 06:01, 12 April 2018 (UTC)
Yes, chcp 65001 is a thing
(Moved to Talk:Unicode_in_Microsoft_Windows#Yes, chcp 65001 is a thing)
--Artoria2e5 contrib 16:29, 9 May 2018 (UTC)
Disambiguation link notification for June 5
Hi. Thank you for your recent edits. An automated process has detected that when you recently edited Meta key, you added a link pointing to the disambiguation page Super key (check to confirm | fix with Dab solver). Such links are usually incorrect, since a disambiguation page is merely a list of unrelated topics with similar titles. (Read the FAQ • Join us at the DPL WikiProject.)
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot (talk) 09:48, 5 June 2018 (UTC)
HP 2000A
I undid your redo on the 2100 article that stated the 2000A was a dual-CPU machine. It was not; see ftp://ftp.mrynet.com/pub/os/HP2000/pdf/02000-90003_sitePrep_Nov69.pdf. The dual-CPU configurations started with the 2000B; it would have been difficult to fit two machines into the A model due to its much larger core. Maury Markowitz (talk) 11:20, 10 July 2018 (UTC)
- I was just restoring some old text; I have no idea. It sure sounded like there was one computer that ran BASIC, and another that did all the I/O to the terminals. Spitzak (talk) 18:31, 10 July 2018 (UTC)
Ordinal indicator
Understood, now. Thanks! Code Page Guy (talk) 11:22, 19 September 2018 (UTC)
Roman Numerals (rules)
I've cut to new section(s) on this long running battle - just a little concerned that your comment might get lost from being stuck on the end of the old one? In fact I nearly moved your last remark down to the bottom, but of course this is (rightly) against the guidelines. Anyway, do make sure you have a look at the current situation at the bottom of the page. Xcalibur's rules, I think we agree, are totally unusable - I have substituted my own idea of what a set of rules might look like - including the rule itself (in simple, unambiguous terms) followed by notes and examples where necessary. But I really think I did a fair job in getting rid of the rules altogether and I still haven't seen a coherent argument to the contrary! --Soundofmusicals (talk) 03:59, 21 November 2018 (UTC)
EBCDIC again
for (c = 'A'; c <= 'Z'; ++c)
becomes
char alfa[] = "ABCD...WXYZ"; for (i = 0; i <= 25; i++) <refer to alfa[i] instead of c>
No procedure call required. Not much of a change. Apologies for not remembering C very well, I haven't used it in years. Peter Flass (talk) 17:03, 21 November 2018 (UTC)
- That requires keeping an extra variable that is unnecessary in ASCII and occupies 27 bytes, one of which is unused. This sort of stuff is exactly what programmers hated back in the day. Spitzak (talk) 19:04, 21 November 2018 (UTC)
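The reason the plain 'A' to 'Z' loop misbehaves on EBCDIC can be shown with a small Python sketch. It assumes code page 037 (Python's `cp037` codec) as a representative EBCDIC variant; other EBCDIC code pages may differ in the non-letter positions. It walks every code value from 'A' to 'Z' and counts how many of them are actually letters.

```python
import string

EBCDIC = "cp037"  # assumption: one common EBCDIC code page

first = "A".encode(EBCDIC)[0]
last = "Z".encode(EBCDIC)[0]

# Every code value a "for (c = 'A'; c <= 'Z'; ++c)" loop would visit.
in_range = bytes(range(first, last + 1)).decode(EBCDIC)

letters = [ch for ch in in_range if ch in string.ascii_uppercase]
non_letters = len(in_range) - len(letters)

# In ASCII the same range contains the 26 letters and nothing else.
ascii_range = ord("Z") - ord("A") + 1

print(hex(first), hex(last), len(in_range), len(letters), non_letters, ascii_range)
```

The lookup-string approach suggested above avoids the gaps because it indexes a table of letters instead of relying on the letters being contiguous in the character set.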
December 2018
You currently appear to be engaged in an edit war according to the reverts you have made on Arabic numerals; that means that you are repeatedly changing content back to how you think it should be, when you have seen that other editors disagree. Users are expected to collaborate with others, to avoid editing disruptively, and to try to reach a consensus, rather than repeatedly undoing other users' edits once it is known that there is a disagreement.
Points to note:
- Edit warring is disruptive regardless of how many reverts you have made;
- Do not edit war even if you believe you are right.
If you find yourself in an editing dispute, use the article's talk page to discuss controversial changes and work towards a version that represents consensus among editors. You can post a request for help at an appropriate noticeboard or seek dispute resolution. In some cases, it may be appropriate to request temporary page protection. If you engage in an edit war, you may be blocked from editing. Kautilya3 (talk) 15:04, 1 December 2018 (UTC)
Alt-keycodes insight
Thanks for your input on pound sign. Your comment (in the edit history) that "on some setups any number equal to 156+n*256 will work, so this is in fact redundant with the 156 example" is interesting, and possibly interesting enough to be included within the article itself, or maybe a more general article such as Windows Alt keycodes - as there are several websites suggesting Alt-6556, and this could be clarified. But can you provide any references to demonstrate that this +n*256 principle is true? And is there a way of determining when (i.e. in which combination of codepage/default locale/language_used_for_non-Unicode_programs/IME/language settings/typeface etc. etc.) this particular combination will produce the pound sign? In my own case (see e.g. Talk:Pound_sign#Explanation_for_Alt_keycode_6556?), I am frequently (but rather randomly: I cannot detect the pattern) unable to enter it via any combination of Alt- 156/0163/6556 etc. As demonstrated at the talkpage, it seems to be connected with codepage 850, but as to why, that is still unclear too. Ozaru (talk) 12:32, 15 December 2018 (UTC)
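The arithmetic behind the "156+n*256" remark can be checked with a short Python sketch. It assumes, as the quoted edit summary suggests, that on some setups only the low 8 bits of the typed number are used; that behaviour is not verified here, and the helper name `alt_code_byte` is hypothetical.

```python
def alt_code_byte(n: int) -> int:
    """Low 8 bits of a typed Alt code (assumed behaviour, not verified)."""
    return n % 256

# 6556 = 25 * 256 + 156, so it reduces to the same byte as Alt-156.
reduced = alt_code_byte(6556)

# Byte 156 is the pound sign in both OEM code pages 437 and 850.
oem_437 = bytes([reduced]).decode("cp437")
oem_850 = bytes([reduced]).decode("cp850")

# Alt-0163 refers to the Windows ("ANSI") code page instead; in
# Windows-1252 byte 163 is also the pound sign.
ansi_1252 = bytes([163]).decode("cp1252")

print(reduced, oem_437, oem_850, ansi_1252)
```

This only shows that the numbers are consistent with the claim; it does not explain the intermittent failures described above.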
Created a safe wiki for Unicode subsets
Guy Macon is a digital version of a murderer. Safe information for Unicode subsets such as WGL4 is now found at https://unicode-subsets.fandom.com/wiki/Unicode_subsets_Wiki . 181.10.158.74 (talk) 11:24, 12 January 2019 (UTC)
XIX and XVIIII
Re your reversion of my edit to Roman numerals: as attested by all the examples cited in the article and in the talk pages, "XVIIII" has always been an acceptable way to write "19", from Roman times (check Julius Caesar) through medieval times. Ditto for the other additive notations for digits 4 and 9, such as VIIII, XIIII, XXXX, LXXXX, etc. The subtractive notations "IV", "IX", "XL" have always been regarded as abbreviations of the additive ones, not as the only "right" way to write those numbers; much like we see "Dr." as merely an abbreviation of "Doctor", not as having superseded it. The Romans almost always used the subtractive notation for the same reason that we almost always write "Dr. Smith" instead of "Doctor Smith": because it saved 50% or more in strokes, time, and space. The only exception is IV, which saves only 25% on all three counts; and that is obviously why "IIII" is much more common than "VIIII" or "XXXX".
Note, for example, that the longest additive numeral from 1 to 12 is "VIIII", that takes the space of six "I" letters. If a clockmaker were to use additive notation for all numbers he would have to leave that much space for each numeral. If he uses "IX" for 9 instead, the longest numeral will be "VIII", that takes only five "I"-slots -- quite a bit easier to fit, and wasting much less space overall. But then using "IV" instead of "IIII" makes no difference, except saving a little bit of metal; and "IIII" looks nicer because it visually balances the "VIII" on the other side.
On the other hand, notations like "IIXX" and "IIX" (or "XIIX") are extremely rare, and their contexts indicate either linguistic induction (as in Romans saying "duodevigesimo" = "two-from-twentieth" for "18th", or "duoetvicensimo" = "two-and-twentieth" for "22nd") or simple scribal/stonecutter error. As for the former, it seems that the 18th and 22nd Roman Legions were often written as "XIIX LEGIO" and "IIXX LEGIO", respectively.
There is even an example where a stonecutter was commissioned to chisel "IIXX LEGIO" for "22nd Legion" but "autocorrected" that to "XVIII LEGIO" instead. (See the Talk page for the source.) So, while no one ever mocked Caesar for writing "XVIIII" and the like all over De Bello Gallico, we must conclude that at some time in the late Roman Age there was at least one adult human in the Roman Empire who felt that "IIXX" was definitely "wrong". (Just as we must conclude that there is at least one cow in Scotland that is black on the left side.)
All the best, --Jorge Stolfi (talk) 01:49, 3 May 2019 (UTC)
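The additive and subtractive forms compared above can be generated with a short Python sketch. The table and function names are illustrative assumptions, and the code says nothing about which forms were historically preferred; it only produces both spellings so they can be set side by side.

```python
# Symbol tables: the additive table has no subtractive pairs.
ADDITIVE = [(1000, "M"), (500, "D"), (100, "C"), (50, "L"),
            (10, "X"), (5, "V"), (1, "I")]
SUBTRACTIVE = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
               (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
               (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]

def to_roman(n: int, table) -> str:
    """Greedy conversion of a positive integer using the given symbol table."""
    out = []
    for value, symbol in table:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)

for n in (4, 9, 19, 40, 90):
    print(n, to_roman(n, ADDITIVE), to_roman(n, SUBTRACTIVE))

# The longest additive numeral on a clock face (1 to 12), by character count.
longest = max(range(1, 13), key=lambda n: len(to_roman(n, ADDITIVE)))
print(longest, to_roman(longest, ADDITIVE))
```

Character count is only a rough proxy for the stroke and space measures discussed above, since the letters differ in width and stroke count.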
- XVIIII is talked about in another paragraph as an alternative. I notice that paragraph does not say "instead of IXX" so there is precedent for not listing every possible alternative when describing each of them. Spitzak (talk) 17:47, 3 May 2019 (UTC)
- The point is that, based on copious evidence from actual texts, Roman and Medieval authors would consider both "XIX" and "XVIIII" as valid alternative ways to write "19"; whereas they apparently regarded "IXX" as "wrong". So I was trying to say "instead of the commonly accepted forms". --Jorge Stolfi (talk) 22:45, 3 May 2019 (UTC)
- I agree, it's just that the section that says "XVIIII is sometimes used instead of XIX" does not say "XVIIII is sometimes used instead of XIX or IIXX". Same should be true of this one. If there are N alternatives there is no reason to write N paragraphs, each saying "x is used instead of x1, x2, ...xN". Spitzak (talk) 23:15, 3 May 2019 (UTC)
CD versus CCCC
Just to pick a nit,
- "CD" is normally written with 3 strokes with same total length as ~5 "I"s, and uses the space of 4 "I"s
- "CCCC" is 4 strokes with total length ~8 "I"s, and uses the space of 8 "I"s.
So the savings for "CD" are 25% in stroke count (roughly a measure of writing effort/time), ~38% in total stroke length (a measure of chiseling work and metal requirement) and 50% in space. That is why I had not claimed a flat "50% savings" for it.
All the best, --Jorge Stolfi (talk) 07:48, 17 May 2019 (UTC)
- Yes, but this "50%" in stroke count is wrong for several of the other samples as well. I'm not sure where this "stroke count" stuff came from; in all examples I have seen, the one with less "stroke count" also has fewer characters. For instance you could say "stroke count" is why "IIV" is not used instead of "III", but "IIV" also has just as many characters. About the only interesting fact is that "IIII" is not twice as wide as "IV", which may explain why "IIII" is used more often than other alternatives. Spitzak (talk) 19:35, 17 May 2019 (UTC)
"Common patterns" in Roman numerals
Hi – in your latest edit to Roman numerals you changed:
- "but these are arranged in a common pattern that remains constant for each power or "place".
To:
- "but a common pattern is used for each of them".
I can live with this, to be honest – in spite of recent accusation of WP:OWN I do try to make a habit of only reverting well-meant edits when important information has been lost, or a misleading or frankly erroneous impression has been created.
On the other hand my original wording in this case was very carefully considered – in the light of persistent misunderstanding of what the paragraph was actually about (it is always safe to assume, I suspect, that the well-intentioned editor is less, rather than more liable to confusion than the average reader). In particular, I was a little apprehensive that my use of the phrase "common pattern" might be misunderstood, especially by a careless reader, or one with a less than perfect command of English.
Obviously, you understood perfectly exactly what I meant, and were able to put it more succinctly, but I do wonder if on reflection you might conclude that the original, while a few words longer, has the advantage of clarity?
Leave this one with you, anyway... ---Soundofmusicals (talk) 01:26, 22 May 2019 (UTC)
Undo to UTF-16
Hi there! Regarding your edit here https://en.wikipedia.org/w/index.php?title=UTF-16&oldid=prev&diff=902560464, the RFC says "SHOULD", which in the language of RFCs explicitly means a recommendation. In section 1.2 of the mentioned RFC, it says 'The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119', and RFC 2119 says '3. SHOULD: This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.' Do you disagree that the "plain English" meaning is closer to recommend? DimeCadmium (talk) 06:32, 30 June 2019 (UTC)
re ®
For Registered trademark symbol: Please please go to the talkpage. You know how we do things at enwiki. -DePiep (talk) 22:28, 17 July 2019 (UTC)
Copyright symbol
Do you have a cite for the "many fonts draw the copyright symbol as a superscript" you recently added in "Copyright symbol"? I've been doing a clean-up and that's the last unsourced bit that needs attention. I attempted to ping you via edit summary, but I'm not sure that shows up as a notification. TJRC (talk) 14:55, 19 July 2019 (UTC)
- No, I copied the information from the Enclosed R page; I also copied it to the registered trademark symbol article. It had no citation, though it does appear true in a few fonts such as the serif and sans serif ones in the browser. An explanation as to why Unicode did not reuse it, when they really like to reuse as much as possible, is needed. I deleted uncited and imho dubious text that circled C is used instead of copyright because the copyright is missing from some fonts (any such font would be missing the circled C as well!) Spitzak (talk) 20:59, 19 July 2019 (UTC)
- The enclosed alphanumerics only references the entire Unicode standard when talking about the disunification. Spitzak (talk) 21:06, 19 July 2019 (UTC)
- Thanks, I'll delete that bit then; it's not all that important to what the symbol is, anyway. Thanks for deleting the Korean bit; that's bothered me for a long time, and I never got around to trimming it. Most of the edit adding it was pretty much nonsense, and in retrospect, I should have cut the entire paragraph rather than just editing out the obvious falsities. TJRC (talk) 22:39, 19 July 2019 (UTC)
Pound sign
I have undone your reversion of reference to # in lead. Article is "pound sign", not "£" or pound sterling. For US readers, "pound sign" means "#". Per WP:LEAD, significant points in the body should be summarised in the lead. If you continue to disagree, please use talk:Pound sign to discuss. --John Maynard Friedman (talk) 11:55, 24 September 2019 (UTC)
To say I am fed up with this article is an understatement - but last time I cut it from my watchlist the result was not good (!) In the meantime I have better things to do than field endless quibbles over precise wording. Current form of the article I can live with - so perhaps we could leave it there - at least until it gets attacked again. --Soundofmusicals (talk) 22:27, 18 November 2019 (UTC)
Sidebar at ordinal indicator
For the currency signs, I was wp:BEBOLD and just commented out the punctuation sidebar. I felt strongly that it was just clutter in those articles and in some, like rouble sign, it had the horrible disruptive effect that you describe: compare the versions before and after I edited it today. Maybe you don't actually need it in ordinal indicator? --John Maynard Friedman (talk) 20:39, 20 November 2019 (UTC)
Google Code-In 2019 is coming - please mentor some documentation tasks!
Hello,
Google Code-In, a Google-organized contest in which the Wikimedia Foundation participates, starts in a few weeks. This contest is about taking high school students into the world of open source. I'm sending you this message because you recently edited a documentation page at the English Wikipedia.
I would like to ask you to take part in Google Code-In as a mentor. That would mean to prepare at least one task (it can be documentation related, or something else - the other categories are Code, Design, Quality Assurance and Outreach) for the participants, and help the student to complete it. Please sign up at the contest page and send us your Google account address to google-code-in-admins@lists.wikimedia.org, so we can invite you in!
From my own experience, Google Code-In can be fun, you can make several new friends, attract new people to your wiki and make them part of your community.
If you have any questions, please let us know at google-code-in-admins@lists.wikimedia.org.
Thank you!
--User:Martin Urbanec (talk) 21:58, 23 November 2019 (UTC)
Removal of *sidebars from punctuation pages
Hey, I noticed that you removed navboxes from (all?) pages on punctuation on December 11th. Just wondering why? It's been reversed on Full stop with no apparent attempt from you to counter it, so I'm unsure whether it's supposed to be that way or not. - Novelyst (talk) 10:45, 19 December 2019 (UTC)
- I was replacing the "sidebar" with a new "navbox" at the bottom. The sidebar is far too big and interferes with images. I didn't revert because I wanted to see if there was consensus the navbox was better. Spitzak (talk) 15:54, 19 December 2019 (UTC)
- Ah, I understand. From what I can see, however, the navbox does not appear to display all punctuation (as did the sidebars) - I see no full stop - or the exact names adjacent. From a more objective standpoint, wasn't the sidebar better in this regard? - Novelyst (talk) 14:30, 20 December 2019 (UTC)
- Full stop is there after the fleuron. The design was copied from the Currency symbols navbox; you can see the names by looking at the popup tooltip or link preview. However it could be altered to have text instead, or more likely both text and symbols. I still feel it would be good to get rid of the sidebar: on several articles it is seriously interfering with legibility because it forces the images out of alignment with text, and it also seems that directions to other pages should be in navboxes. Spitzak (talk) 18:18, 20 December 2019 (UTC)
Broken bar
In your edit to vertical bar, you say that ⇧ Shift+\ produces a solid vertical bar even though the engraving has a broken bar. I suspect that this behaviour is OS dependent. So maybe best you qualify your caption with a note to say which OS? (I have UK extended, not US-int). Your call. --John Maynard Friedman (talk) 11:32, 16 February 2020 (UTC)
'
If you are going to revert six edits in a row [1], [2], [3], [4], [5], [6], it would be appreciated if you would provide an explanation for at least one of them. --R'n'B (call me Russ) 00:39, 14 April 2020 (UTC)
- I was trying to fix the page it directed to. The page about straight apostrophe is completely unhelpful in explaining why this redirect is needed. Spitzak (talk) 03:30, 14 April 2020 (UTC)
- I'm not seeing anything there that could not be merged into Apostrophe#Typographic form. I would therefore disagree that Apostrophe is completely unhelpful. It is somewhat helpful and could be made more so. BD2412 T 04:08, 14 April 2020 (UTC)
- Obvious problems with ' (disambiguation) for describing ’:
- The words "straight version" in the apostrophe
- The words "straight version" in the single quotes
- Implication it can be used for more than the closing quote.
- It certainly is not a "modifier letter left half ring"
- Wrong for Okina
- Wrong for Saltillo
- Wrong for stress
- Wrong for prime, foot, minute, minute of arc
I agree the interesting information could be put in the Apostrophe section you recommend, but the redirect has to go there and it has to show both meanings (at least for fonts with both ‘ and ’ they intended the character to be used as a quotation mark, so it has to show that). Spitzak (talk) 04:17, 14 April 2020 (UTC) PS: adding " (disambiguation)" to links that happen to go to disambiguation pages is counter-productive, as it means if anybody ever edits the disambiguation page into a subject it won't fix these links. Spitzak (talk) 04:22, 14 April 2020 (UTC)
- Yes, a section redirect is possible. With respect to adding "(disambiguation)" to intentional disambiguation links, this is required by WP:INTDABLINK to remove them from the queue of errors needing to be fixed. On the rare occasion that a disambiguation page is converted to a regular article, these are also routinely fixed by disambiguators. We have been at this a very long time. BD2412 T 17:16, 14 April 2020 (UTC)
Talk:List of typographical symbols
Could you have a glance at Talk:List of typographical symbols to see if you agree with the direction I'm taking? --John Maynard Friedman (talk) 09:58, 25 April 2020 (UTC)
Common interests/focuses in/on character sets and code pages
It seems that we have very similar focuses/interests (at least recently) on Wikipedia. If you want to propose any online article collaboration on character sets, code pages, or encodings with me, I will consider it. Hkbusfan (talk) 10:36, 26 April 2020 (UTC)
- If you are interested and have time, please see my addition to the talk page on Code page 437. Hkbusfan (talk) 11:13, 26 April 2020 (UTC)
UTF-1 Revision
Hello. I undid your revision to the UTF-1 article. I've explained my reasoning for doing so on that article's talk page. Thanks. — Preceding unsigned comment added by Maschinengott (talk • contribs) 20:45, 29 April 2020 (UTC)
Danish keymap and AltGr
The Danish keymap was always under the X-windows section, like the Swedish keymap. It was I who tagged as 'clarification needed, reason=which OS?' until today when I realised that it should have been obvious from the nesting that it is X-windows. So I have reverted your change since (guessing that you are not Danish) neither of us can guess the intentions of the editor who first put it under x-windows. If it bothers you, I know a Danish editor who might be able to clarify if you want? --John Maynard Friedman (talk) 23:09, 13 May 2020 (UTC)
- It appears to exactly match the description of key combinations in the Windows section, and I do know that Linux tends to have a lot more AltGr combinations. I suspect it is the Windows keymap, but it is also redundant as that is already described. So maybe it should just be deleted.Spitzak (talk) 04:38, 14 May 2020 (UTC)
- Yes, I shall delete it as uncited and unclear. --John Maynard Friedman (talk) 09:38, 14 May 2020 (UTC)
- Is it time to continue this discussion at template talk:char?
- How easy would it be to add something like |size= big |style= bold ? But honestly it would be a lot easier if we could use the regular markup. --John Maynard Friedman (talk) 08:03, 19 May 2020 (UTC)
Highlighting small symbols using template:code
I have been using {{tl:code}} to highlight tiny symbols, especially any (like straight apostrophe) that could be confused as markup or just plain overlooked: it makes it clearer that this is the symbol being described, it is not just part of the text. So I wondered why you removed it from guillemet: is it just a subjective taste thing?
(BTW, this template is different from {{tl:mono}}, which I tried in another context and didn't find at all helpful. Compare '
with ' and just plain ' ).--John Maynard Friedman (talk) 09:38, 14 May 2020 (UTC)
- That template also uses a monospace font, so it is pretty bad for a lot of typographical characters, as they tend to shrink, compress, or expand them to be fixed-width. If there is another way to put the character in a box it might be useful.Spitzak (talk) 18:43, 14 May 2020 (UTC)
- Ah, ok, now I understand your edit note. Right now, though, I really believe that we need the highlight so I will revert per WP:BRD pending a better solution because the cure is worse than the disease for most readers. I do understand and sympathise with your point though, so will search for a functionally equivalent template that doesn't have this annoying side effect. --John Maynard Friedman (talk) 18:51, 14 May 2020 (UTC)
- Please continue discussion at talk:Guillemet --John Maynard Friedman (talk) 18:55, 14 May 2020 (UTC)
This seems to work, I can't find any way to get the box other than the code command: <code>{{serif|fooiMMM}}</code> -> fooiMMM
I would get rid of all the parentheses, quotes, angle brackets, and other stuff people have used. Even in this example there are now unnecessary parentheses.
- Yes, I agree. I was editing typewriter yesterday and they really are redundant. But I had to use {{code}} until we come up with something better.
- I've been looking at Template:Navbox punctuation which has the symbols against a pale grey background. The actual formatting is done at Template:Navbox punctuation/set which maybe you could raid for inspiration? Is there anything in template:Semantic markup templates we could use (nothing leaps off the page at me). --John Maynard Friedman (talk) 10:39, 16 May 2020 (UTC)
- I think it might be ok to make template:char and make it do the code+serif hack shown here. Then pages can be edited to use it, and it can be changed later.Spitzak (talk) 18:42, 16 May 2020 (UTC)
- In principle, I agree but what makes me hesitate is not understanding the way wp uses CSS. I also don't know whether there are any usability issues if we override the readers font choice. (As indeed mono does already). But meantime I can't see anyone objecting to you building the char template and there is something definite for people to comment on. --John Maynard Friedman (talk) 20:17, 16 May 2020 (UTC)
- I don't think making a template is any worse than putting the code directly in the page, and the template makes it much easier to fix in the future if it is done wrong.
- Another one to look at is template:keycap it draws a box and does not mess with the font.Spitzak (talk) 20:58, 16 May 2020 (UTC)
- A template is definitely the way to go, no question.
I knew about {{keypress}} but not about keycap. The output seems similar: ° v °. The only concern I have about just using it 'as is' is the shadow effect that aims to give an "artist's impression" of a raised key. So it is 'just' a question of copying and hacking the code of keycap and we have our {{char}}. Can you do it? [Sentence struck out because keycap redirects to keypress].
- Another advantage of using a template is that maybe someone in the future could develop it to support an argument like "font=" or "style=" or even "size=" or "colour=". But let's get basic version done first! --John Maynard Friedman (talk) 17:03, 17 May 2020 (UTC)
- The Lua code that {{keypress}} uses is at Module:Key. It looks like the box-shadow code is near the top. But I have no idea whether the average editor can create a new Module:Char (the same as Key but without the shadow)? --John Maynard Friedman (talk) 17:15, 17 May 2020 (UTC)
Specification for template:char
Just so as to be clear, compare these
- ° {{highlight|°}}
- ° {{code|°}}
- ° {{keypress|°}}
Do you agree that we don't want any kind of hard box around the character, and that {{code}} is closest? But I notice that on the mobile version, {{code}} has a very thinly drawn box. Even so, it is acceptable to my eye. --John Maynard Friedman (talk) 21:36, 17 May 2020 (UTC)
- I like the code one the best. It would be nice if the extra space it adds was not there, though. Another place to look is the "buttons" at the bottom of this text editor, they look pretty good. Sorry I have no idea how to decipher the code behind the keypress template.Spitzak (talk) 16:08, 18 May 2020 (UTC)
- Whoa! I found out the lowest-level code by examining those buttons. Here it is:
- <span style="border: 1px solid #ddd; background-color: #f9f9f9; padding: 1px 4px">°</span> -> °
- Spitzak (talk) 16:24, 18 May 2020 (UTC)
- Fantastic! So do you want to give creation of {{char}} a try? --John Maynard Friedman (talk) 19:40, 18 May 2020 (UTC)
- (BTW, maybe you already did this but I can confirm that it displays as expected on both desktop and mobile wikis). --John Maynard Friedman (talk) 19:53, 18 May 2020 (UTC)
- I made template:char and put a single usage on guillemet. Go for it, feel free to modify and add more uses! Spitzak (talk) 20:11, 18 May 2020 (UTC)
What a Brilliant Idea Barnstar
By Jove, I do believe she's got it! John Maynard Friedman (talk) 20:48, 18 May 2020 (UTC)
Some niggles to resolve
- In some articles (for example, Equals sign), the symbol is given in bold because that is the standard for redirect targets. '''{{char|°}}''' doesn't do that but '''{{code|°}}''' does (° v °).
- In some articles, for example Equals sign again, the symbol is given a larger size (viz., '''{{Big|1==}}'''). Actually I don't think it is necessary in that particular case but there are some tiny glyphs that do need it. I don't understand why you would want to deprecate <big>{{char|°}}</big> but I see it works: °
--John Maynard Friedman (talk) 21:15, 18 May 2020 (UTC)
- Sorry, I've re-read the template doc and you do explain why you don't like it. But this behaviour is not unique to your template:
- The quick brown fox jumped over the lazy dog.The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog. ° The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog.The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog.The quick brown fox jumped over the lazy dog.The quick brown fox jumped over the lazy dog.The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog. ° The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog.The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog.The quick brown fox jumped over the lazy dog. The quick brown fox jumped over the lazy dog.
- I think we just have to accept it. It really isn't that offensive. --John Maynard Friedman (talk) 21:58, 18 May 2020 (UTC)
- It looks like <big> does not change the line spacing. I was trying to set font-size in the <span>. So if big works, put it in the template. Bold might be a good idea too, but I think you got bold because <code> thought it was a parser token. Spitzak (talk) 00:16, 19 May 2020 (UTC)
Sandbox
I created a template:char/sandbox trying to make a bold on/off. It hasn't worked (see the test-cases page). I don't know of another template that has conditional styling like we need, that I could learn from. Do you? --John Maynard Friedman (talk) 10:24, 19 May 2020 (UTC)
- The other way round it of course is to create a template:charb that is the same as char but with bold on top? Maybe if someone in the future comes up with a Cunning Plan that resolves the issue, it will be easy to redirect. --John Maynard Friedman (talk) 15:01, 19 May 2020 (UTC)
- I don't see any easier way to show the user a choice other than just having them put the quotes around the character to make it bold. However, why not just make *all* the characters bold? I think the reason equals sign was bold was because of code's syntax highlighting. Spitzak (talk) 16:27, 19 May 2020 (UTC)
- The 'usual' (!) way is to put three straight apostrophes on either side of the template call. (Interestingly, a question was raised at template talk:code querying why the template doc says it doesn't work when it does). '''{{char|@}}''' @ doesn't go bold and I was really surprised by {{char|'''@'''}} @ because I thought it would just produce a string like abc@xyz, but it neither does that nor goes bold. Well, it wouldn't, the span syntax determines the presentation.
- The bold = is because the article is about 'equal to' and if you search with an equals symbol (=) then you will be redirected to that article and 'house style' says that redirect targets must be bold. Which is precisely why I want {{charb}} or {{char|@|bold=}}. I'm warming to the idea of {{charb}}, there are certainly precedents for template variants for awkward cases and having it means we can get the changes underway. (I've already done Bracket). Unless you object, I will go ahead and create it. --John Maynard Friedman (talk) 19:11, 19 May 2020 (UTC)
- Triple quotes inside the template worked for me, but it is probably ok to make a template:charb. Spitzak (talk) 19:23, 19 May 2020 (UTC)
- Yes, I've just seen your sandbox tests and I'm beginning to doubt my sanity! I am convinced that it wasn't bolding, but there is no doubt now that it does, right there in the text I wrote above. I had better go lie down in a darkened room for a while. Sorry for wasting your time chasing fairies. Charb is dead, long live char! --John Maynard Friedman (talk) 19:56, 19 May 2020 (UTC)
Someone doesn't like char
See template talk:char#I have doubts about this template. --John Maynard Friedman (talk) 23:29, 14 June 2020 (UTC)
Nomination for deletion of Template:Char
Template:Char has been nominated for deletion. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. Psiĥedelisto (talk • contribs) please always ping! 05:59, 6 July 2020 (UTC)
- In case you missed it, see this diff. So it is no longer disputed that WP:STATUSQUO means (as it should have meant) that you can re-instate the template as it was, pending an MOS debate.
- The concern about screen-readers remains a serious one so it is not ever so obvious what is the best thing to do. The template is useful to sighted and partly-sighted visitors but possibly counter-productive to blind visitors. The answer may boil down to tactics: if we had a convincing argument drafted and nearly ready to deliver for debate, then it would be less likely to be deleted again. The risk is that someone else will make an adversarial proposal in the meantime that is difficult to counter when we don't have a solution to this show-stopper. --John Maynard Friedman (talk) 09:50, 19 July 2020 (UTC)
- @John Maynard Friedman and Spitzak: Indeed—I won't revert an edit that restores it. If you restore it, though, you should have a MOS discussion ready to settle the consensus issue once and for all...in my opinion. I'll also probably remove it from more places as I've done to section sign, pilcrow, and numero sign so far as I come across them, which after some thought and consideration of what VanIsaac wrote me on their talk page, actually seems like the correct play as even I think the template can be useful for small punctuation/combining marks...I guess what y'all need to do is figure out what this template's really for, what problem it solves, what problems it's not meant to solve.
Best, Psiĥedelisto (talk • contribs) please always ping! 12:10, 19 July 2020 (UTC)
A barnstar for you!
The Civility Barnstar
I very much appreciate how willing you've been to see changes made to Template:Char, and how you've kept your cool despite how heated that discussion has gotten. This, along with User talk:John Maynard Friedman § Dedicated to you, is my attempt to lower the temperature a bit and reach out with a bit of an olive branch. Psiĥedelisto (talk • contribs) please always ping! 22:29, 8 July 2020 (UTC)
Trip hazard warning
Anticipating potholes before going to talk:MOS might be wise. The only observation about {{char}} (as it was) that really stopped me in my tracks was the one by SMcCandlish: screen-readers for blind users don't cope well with span style. They gave me a succinct summary which will have to be taken on board.
Broadening their advice a bit, if the revised template uses standard html markup as much as possible, it should be safe to assume that screen readers have been programmed to deal with those, just not CSS or embedded styles. I've been doing some experimenting, so here they are to save you some work:
- {{keypress}}: © ©
- {{char}}: © ©
- Original char: ©
- {{code}}: © © is monospaced so a squeezed oval
- {{samp}}: © the © is also monospaced
- with font var: © the [No effect so maybe I've not coded this correctly].
- {{para}}: |© the= |©=
I'm willing to help in the background. --John Maynard Friedman (talk) 20:47, 17 July 2020 (UTC)
- I have just been reminded of WP:VPT, where maybe someone will have some suggestions or even solutions? --John Maynard Friedman (talk) 10:09, 18 July 2020 (UTC)
Multiplication sign
I guess that it would be a bit cheeky for me to propose your idea, so you are welcome to open the RFC if you prefer?
I assume what you had in mind was something like this:
What do you think? --John Maynard Friedman (talk) 21:58, 6 August 2020 (UTC)
- Looks good to me but I would make it look like this:
- It is not only the Unicode Consortium who calls this a multiplication sign.Spitzak (talk) 22:22, 6 August 2020 (UTC)
- I suppose I was trying to find a way to justify WP attaching the name "multiplication sign" to that particular glyph. In Germany, it is not the multiplication sign (they use dot operator). I suppose we can cite wp:common name (in English). I was trying to learn from currency sign and currency symbol. You are probably right, your version is more likely to achieve consensus. Anyway, it would be civilised to wait until late August to propose it.
- (By the way, could you try to use {{rto}} to me, rather than hope I will notice that a change to your talk page is relevant to me).--John Maynard Friedman (talk) 23:50, 6 August 2020 (UTC)
UTF-8
Why did you remove this? It's not "unnecessary complexity" and it might not be in referenced sources, but it's a simple fact that naturally follows from previous explanations. Even the text says "If the number of significant bits is no more than seven, the first line applies; if no more than 11 bits, the second line applies, and so on." I just quantified it in the table, so readers like me who wanted to understand UTF-8 wouldn't have to count them manually. It seems to me as if you reverted just because I don't have a userpage and therefore am less experienced. Eksekk (talk) 10:44, 11 August 2020 (UTC)
- It is unnecessary and redundant (same as number of x's), not used to actually implement UTF-8 (comparing to max and min values is much easier), in the wrong column (it is an input so it should be on the left), has way too large a title, and mostly because it makes UTF-8 look more complicated than it is. Spitzak (talk) 16:37, 11 August 2020 (UTC)
- Using the same logic, you could remove number of bytes, as it's already shown more in depth by individual byte contents. No one is going to prefer counting x's themselves instead of having a flat out number.
- I have tried to remove the number of bytes but it got reverted, and it actually is that way in the reference.Spitzak (talk) 19:24, 11 August 2020 (UTC)
- It's meant to enhance information, not replace it.
- I don't think so, and even if you insist, it could be moved instead of reverted.
- Then please think of better one, I tried my best. Maybe "bit count" and reference explaining it? Also a line break can be inserted to split it into two rows, probably the best choice.
- Older versions said "bits"Spitzak (talk) 19:24, 11 August 2020 (UTC)
- That's too general. The best I can think of is "Significant bit count" split in two lines. Eksekk (talk) 19:34, 11 August 2020 (UTC)
- No it doesn't, it makes sense of all bytes beginning with 10s, which threw me off.
- Sorry I don't see how this clarifies anything at all for what the 10 means.Spitzak (talk) 19:24, 11 August 2020 (UTC)
- Should have posted more clearly. I was confused why there are bytes beginning with 10 and how that encoding actually works, then noticed a thing about 7/11 bits and deduced that these x's must be replaced with actual content. I think direct statement in the table might be helpful for understanding. Eksekk (talk) 19:34, 11 August 2020 (UTC)
- Eksekk (talk) 18:59, 11 August 2020 (UTC)
- Letting you know that I posted on WP:3O. Eksekk (talk) 19:24, 11 August 2020 (UTC)
- Oh and by the way sorry for accusing you of bad faith, I've seen now that you have managed articles about encodings for a long time. Eksekk (talk) 19:35, 11 August 2020 (UTC)
- If you really think this is important, look in the history, there were shorter readable methods. This has been added and removed from the table many times. My best guess is that it be called "bits" and placed after the column for maximum number. Also you will need to figure out if it makes sense to add a "bits" column to the matching tables in the "history" section.Spitzak (talk) 19:38, 11 August 2020 (UTC)
- I stumbled across this at Wikipedia:Third opinion. FWIW, I think that in most contexts adding clearly redundant information makes interpretation more difficult, not easier. This may be due to two things: a cluttering effect and the implication by its inclusion that it is not redundant (the latter making the reader spend the effort of figuring out that it is only redundant information). The more subtle aspects of the encoding, such as whether longer encodings of a value are valid (i.e. the minimum number of "significant bits" for a given byte length), are answered concisely and clearly in the first three columns, but not by the removed addition. So: for improved clarity of presentation, I would support Spitzak's perspective. —Quondum 14:26, 14 August 2020 (UTC)
- Thank you, that is exactly what I think but you explained it clearly. It is not clear to the user that information is redundant, so they waste time or are confused trying to figure out what additional information is being provided.Spitzak (talk) 16:22, 14 August 2020 (UTC)
- Would it be worth removing the request from the WP:3O page? It is still there, but I see a third opinion has already been provided. Maidyouneed (talk) 06:28, 19 August 2020 (UTC)
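For readers following the thread above, here is a minimal Python sketch of the point that the byte count can be picked by comparing the code point against each row's maximum value, with no need to count significant bits. The function names `utf8_length` and `utf8_encode` are made up for illustration; this is one possible way to write it, not a reference implementation.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 needs, by comparing against row maxima."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates are not encoded in UTF-8")
    if code_point <= 0x7F:
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    return 4


def utf8_encode(code_point: int) -> bytes:
    """Encode one code point by filling the x's of the table, low bits first."""
    length = utf8_length(code_point)
    if length == 1:
        return bytes([code_point])
    lead_prefix = {2: 0xC0, 3: 0xE0, 4: 0xF0}[length]
    out = []
    for _ in range(length - 1):
        # Each continuation byte is 10xxxxxx: six payload bits.
        out.append(0x80 | (code_point & 0x3F))
        code_point >>= 6
    # Whatever bits remain go into the lead byte.
    out.append(lead_prefix | code_point)
    return bytes(reversed(out))


print(utf8_encode(0x20AC).hex(" "))
```

The "significant bits" figure never appears in the code: the range comparisons alone select the row, and the shifts fill in the x's.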
Numeric input
It is probably for the best if I don't reply at talk:Unicode input, it will only mess up the discussion thread.
Windows certainly does have a track record of using binary values that are in the 'reserved for control codes' range, notably the € sign and curly quotes. It absolutely did not use the correct unicode code point (maybe it does now? I don't use Windows anymore). When a file with one or more of those code-points was sent to a Mac or Linux user, the result was not printable: typically it was just ignored. Peter argues that it doesn't arise: the user will either use Alt+(the decimal equivalent of x22) for straight quotations or Alt+(the decimal equivalent of x201C) for left curly quote, autocorrect doesn't arise. So what happens when (with code page 1252) the user enters Alt+0128 to get a € sign? Will the resulting file contain x80 or x20AC? Why? (or why not?).
We know from our Japanese friend's experience at talk:Pound sign#Explanation for Alt keycode 6556?, Alt+0nnn does not generate the Unicode code point for nnn.
This is why I suggested that you guys first get clear on terminology because Microsoft has been playing fast and loose with this stuff and you stand on quicksand. --John Maynard Friedman (talk) 19:25, 22 September 2020 (UTC)
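To make the x80 versus x20AC question concrete, here is a small Python sketch of how the same byte is read under two encodings. It only shows the published code page mappings as implemented by Python's standard codecs; it does not claim to show what any particular Windows application stores when Alt+0128 is typed, which is the open question above.

```python
# The single byte 0x80 means different things depending on the assumed encoding.
raw = bytes([0x80])

as_cp1252 = raw.decode("cp1252")   # Windows-1252 maps 0x80 to the euro sign
as_latin1 = raw.decode("latin-1")  # ISO 8859-1 maps 0x80 to a C1 control code

print(f"cp1252:  U+{ord(as_cp1252):04X}")
print(f"latin-1: U+{ord(as_latin1):04X}")

# The euro sign as a Unicode code point, serialised as UTF-8.
print("UTF-8 bytes for U+20AC:", "\u20ac".encode("utf-8").hex(" "))

# The curly double quotes sit in the same 0x80-0x9F range of Windows-1252.
curly = bytes([0x93, 0x94]).decode("cp1252")
print([f"U+{ord(ch):04X}" for ch in curly])
```

A file containing the raw byte x80 is therefore only a € sign to a reader that assumes Windows-1252; a reader assuming ISO 8859-1 sees a control code, which fits the "not printable" behaviour described above.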
Underscore, underline
I have in mind to do a wp:merge on underscore and underline since they seem to cover more or less the same thing. Before I go to the effort, can you see any redeeming features? I don't want to waste time on a pointless exercise. Do you think it needs a WP:MERGEPROP? Is it at all controversial? --John Maynard Friedman (talk) 18:55, 28 September 2020 (UTC)
- I assume the result will be underline? This makes sense, the character on the computer keyboards is based on a typewriter key whose purpose was to add underlines to already-typed text. The new article would talk about underlines first, then talk about the character, how it no longer works to underline text, but that many other uses were found for it. Spitzak (talk) 20:13, 28 September 2020 (UTC)
- That might run into an ENGVAR issue? Is 'underscore' an American usage? --John Maynard Friedman (talk) 00:00, 29 September 2020 (UTC)
- I think "underscore" is the character "_". If a typewriter user overprints a lot of them on some other letters, the result is an "underline".Spitzak (talk) 01:36, 29 September 2020 (UTC)
- Sorry, unconvinced. One letter can be underlined. The UC calls it 'low line', so I can't even hide behind them. Wiktionary defines 'underscore' by reference to underline, but not conversely. I think it will have to be a MERGEPROP. If I do it. --John Maynard Friedman (talk) 07:47, 29 September 2020 (UTC)
Google's Ngram viewer has underscore the clear winner: Books Ngram Viewer: underscore, underline, so wp:common name. --John Maynard Friedman (talk) 09:19, 29 September 2020 (UTC)
- You are right it is ok to call both the line under letters and the ASCII character "underscore", and if that is the more common term I would use it for the title.Spitzak (talk) 17:00, 29 September 2020 (UTC)
Clarification of my 3270 edit
I updated the reference to the 3270 as a thin client because the original text claimed that web clients were thin clients and I thought that referring to X terminal would be less contentious than deleting it entirely. The references to not uploading scripts were an attempt to avoid an edit war. I'm perfectly happy with deleting the sentence as long as it stays deleted and does not get changed to the original.
Stay healthy. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:31, 29 September 2020 (UTC)
Alt codes
Spitzak, if you would think in terms of this article being for Mac and Unix users (who have little or no prior knowledge of MSDOS and Windows), it might make it more obvious why Peter and I are intent on clarifying everything. Your replies seem to assume we already know most of it and are being deliberately obtuse. We aren't. We are trying to tease out the details and I in particular am trying to dig out the key underlying principles that are so easy to lose if the article is written exclusively from the pov of an American user. When we have properly explained the experience of our Japanese friend, then we have succeeded. I do realise that MSDOS in particular never took any account of "international" users in its design, so trying to retrofit a logic that was never there in the first place is hard, but we really need to set it in a standards-based context because that is the only scaffolding that we may assume our readers already have.
So thank you for your help thus far, and I hope you can accept having your patience tried a little longer. The article is already vastly improved from where it was, mostly because of your contributions. --John Maynard Friedman (talk) 22:00, 6 October 2020 (UTC)
You may have already noticed - but the "rules man" is at it again. I am SO fed up with this -which is probably all this troll really wants. Trying hard to be patient but what the [naughty word expunged]. I suspect you are even more fed up than me, but would appreciate your support in establishing a firm consensus here. --Soundofmusicals (talk) 22:24, 12 October 2020 (UTC)
- For anyone interested, I responded at User talk:Johnuniq#Roman numerals. Johnuniq (talk) 03:34, 13 October 2020 (UTC)
- I am sure you are up to date on this without further prompting from me - but we do need to keep this going! --Soundofmusicals (talk) 01:27, 20 October 2020 (UTC)
I know this is a horrible bore (no one more fed up than me) but a quick little one line comment from you is probably all we need to get on top of this nonsense! -Soundofmusicals (talk) 21:39, 31 October 2020 (UTC)
Sharp s
If you can spare a few minutes, I'd welcome you checking my edits to ß, please, as they were fairly extensive changes to the markup. See also talk:ß#Markup, where I've left an explanatory note. --John Maynard Friedman (talk) 18:54, 21 November 2020 (UTC)
Private Use Plane redirect
Private Use Plane redirects to Plane (Unicode)#Private Use Area planes, not to Private Use Areas
Private use plane is the one that redirects to Private Use Areas.
Note the difference in capitalizations. The hatnote has to match the redirect exactly. Since it doesn't, the article is in Category:Articles with redirect hatnotes needing review. I would think that both should redirect to the same place. I have since changed Private use plane to go to the same place as Private Use Plane and removed that hatnote again. MB 16:13, 20 December 2020 (UTC)
Backslash
Thanks for cleaning up the sloppy citations. I intended to use wp:REFILL but real life intervened. I don't think we can use the url=, per WP:COPY, see talk:backslash. --John Maynard Friedman (talk) 21:38, 4 January 2021 (UTC)
Tilde
Would you review tilde, please. I have cleaned it up after a well-meaning editor tried to 'improve' it but their knowledge of typography is worse than mine, which takes some doing. I may have been too focused on the detail, would you check that it still hangs together as a whole? --John Maynard Friedman (talk) 17:50, 8 February 2021 (UTC)
- I saw that, probably a good start but even now it is enormously repetitive, describing dead keys over and over again. Also it is completely wrong about ASCII, at least initially they certainly 100% intended the accent characters to be overprinted to produce accented letters. The fact that this was not going to work on most computer hardware was either ignored or not learned until too late. And all the accents quickly acquired other uses and the images grew and moved as the need to use them as overprinted accents disappeared. Spitzak (talk) 19:27, 8 February 2021 (UTC)
- I cannot tell a lie, it was I who wrote that backspace & overtype was technically obsolete. No wonder I couldn't find a citation. I've never seen a printer that could do that but maybe teletypes did?
- I deleted a load of repetitious waffle about dead keys, don't tell me that there was still more! --John Maynard Friedman (talk) 19:56, 8 February 2021 (UTC)
- No, I was referring to the earlier version. You did fix it a lot.Spitzak (talk) 19:58, 8 February 2021 (UTC)
- Do you have a citation that overprinting was in the original spec? The ñ character was encoded individually.
- I have just realised that there is another horrible error that I 'corrected' as in made credible: ASCII has no code points for accented letters. I'll have to go back and correct again. --John Maynard Friedman (talk) 20:10, 8 February 2021 (UTC)
- I believe the original reason the accent characters (and the underscore) appeared in character sets was to allow overprinting of accent marks. However it is true that by the time ASCII was being designed, they incorporated these primarily for compatibility with previous sets and the inability to use them as accents was already well established. So this is just the original reason they were there. It is possible users of mechanical typewriters actually started typing the tilde and space to get squiggly dashes and that the main reason by the time computer sets were started was to replicate this. Spitzak (talk) 20:20, 8 February 2021 (UTC)
- I think that it is still wrong. The position of the 007E tilde on the vertical axis is a type designer's choice, there is no spec that I know of that says it has to be the same as the horizontal line of an e (or an E). --John Maynard Friedman (talk) 20:40, 8 February 2021 (UTC)
- Thanks, you got there first. I had just realised that ñ and Ñ were not in the original ASCII and went back to remove them. The tilde on the old ASCII chart at File:USASCII_code_chart.png looks like the CJK double width one! --John Maynard Friedman (talk) 20:53, 8 February 2021 (UTC)
Once more with feeling
I've rewritten Tilde#Keyboards so a review would be welcome given that I don't have a US keyboard to check it on. --John Maynard Friedman (talk) 23:36, 1 April 2022 (UTC)
- On a US keyboard the tilde is on a key at the upper-left corner to the left of the '1' key. Unshifted it has the backtick (`) and shifted is the tilde (~).Spitzak (talk) 00:48, 2 April 2022 (UTC)
Color banding revert
Why was my edit fixing a misspelling reverted?
- It said right at the top that the article uses British English.
Frass valley
Hi! You may be interested in Wikipedia:Redirects_for_discussion/Log/2021_June_22#Frass_valley. – Uanfala (talk) 21:19, 22 June 2021 (UTC)
Html codes in infobox at Full stop
I don't understand why you removed the |html= option from the infobox at full stop. It is useful info and all other punctuation infoboxes have it. Your edit note says nothing. So why? --John Maynard Friedman (talk) 22:15, 28 June 2021 (UTC)
- I think the information is useless since there are extremely few cases where the period itself cannot be used in the document. I also find the html shortcuts being part of the "Unicode" template an extreme distraction. They should be printed somewhere else, and stop putting the numerical versions in as the user can type &#xNNN; using the hex Unicode number. Spitzak (talk) 22:48, 28 June 2021 (UTC)
- You would need to take that view to (a) template talk:infobox punctuation mark and (b) template talk:unichar. I didn't write these templates, they do what they do. Yes, it is optional whether or not to use the |html= option, but not how much of it you get if you do (and for that reason I often don't use it because it can be disproportionate and even undue). I don't care enough about it to push it: the real point I'm making is that your edit note didn't give a clue. --John Maynard Friedman (talk) 07:35, 29 June 2021 (UTC)
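To illustrate the numeric form mentioned in this thread: a minimal Python sketch, assuming the hexadecimal reference syntax is `&#xNNN;` (with the leading ampersand) and using the full stop, U+002E, as the example character. It relies only on the standard `html` module; the variable names are illustrative and not taken from any template.

```python
import html

# The full stop is U+002E; build its numeric character references
# from the code point alone, without needing a named entity.
code_point = ord(".")
hex_ref = f"&#x{code_point:X};"   # hexadecimal form
dec_ref = f"&#{code_point};"      # decimal form

# Both references decode back to the same single character.
print(hex_ref, dec_ref, html.unescape(hex_ref) == html.unescape(dec_ref) == ".")
```

The same two lines work for any character, which is the point being made above: once the hex code point is known, the numeric reference can be typed directly.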
Chevrons
SNAP!!! Drat, you saw it first. I was just about to correct my error but too late. Thanks anyway. --John Maynard Friedman (talk) 19:41, 28 July 2021 (UTC)
Undecimal
Not likely to have been a joke, given Lagrange's concern with making things easy for the common Frenchman, as reflected in his comments as recorded in the lectures given at the École Normale. See the material cited on the Lagrange and undecimal pages.
CP437 Turkmen
Remember over a year ago we discussed how CP437 may have been the inspiration for some of the unusual characters in Turkmen? I actually found a potential source for the CP437 and/or latin scripts article: Factors Influencing the Success and Failure of Writing Reforms. Hkbusfan (talk) 10:19, 19 October 2021 (UTC)
- I'm also not quite sure how to cite it (the article can be searched for on Google). Hkbusfan (talk) 10:23, 19 October 2021 (UTC)
- I think the fact that CP437 influenced the design of a written language is pretty interesting, it would be nice to add that (probably to both the CP437 and Turkmen articles). Spitzak (talk) 17:33, 19 October 2021 (UTC)
x005C
If you have the time and inclination, would you care to add your 2¢ worth at talk:Won sign#"Most Korean keyboards input 0x5C when the won sign key is pressed", please? I don't know enough about Microsoft code pages to contribute intelligently. --John Maynard Friedman (talk) 19:30, 30 October 2021 (UTC)
Reverted example of UTF-8 codepage layout
Hi Spitzak, I noticed you recently reverted my edits on UTF-8. I made my changes because when I first read this article I found it difficult to understand how the codepage layout works, and I was hoping future readers could benefit from an example, especially since the codepage is a valuable resource once you understand how to read it. Please consider allowing my edit to stand. Cheers, --RubberDuckDebugger (talk) 21:24, 5 November 2021 (UTC)
- @RubberDuckDebugger: I too have had difficulty making sense of how UTF8 works (UTF16 is even worse), so thought your text might at least make a useful footnote (using template:efn). But I'm afraid your text
For example, cell 9D says +1D. The hexadecimal number 9D in binary is 10011101, and since the 2 highest bits (10) are reserved for marking this as a continuation byte, the remaining 6 bits (011101) have a hexadecimal value of 1D. These characters never occur as the first byte of a multi-byte sequence.
- needs another sentence between "value of 1D" and "These characters". But talk:UTF-8 is a better place to discuss this.
- Spitzak, I thought "useless verbiage" rather excessive and RDD's response is remarkably restrained. It may be obvious to you but not to everyone. --John Maynard Friedman (talk) 00:00, 6 November 2021 (UTC)
- @John Maynard Friedman: Thanks for chiming in. A footnote sounds like the right place for my example and it looks like a Notes section already exists with similar content. I'll go ahead and move my example down there. --RubberDuckDebugger (talk) 03:03, 6 November 2021 (UTC)
- If this is really felt to be necessary I guess it is ok. I am just worried that the page is getting quite redundant. There are six x's in the table at the start which IMHO make it perfectly clear where the bits are. The character table was originally to show what bytes were allowed and disallowed, it has somehow bloated into quite a monster.Spitzak (talk) 00:42, 7 November 2021 (UTC)
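The arithmetic in the quoted example (cell 9D reading +1D) can be checked with a short Python sketch. This is only an illustration under stated assumptions: `continuation_payload` is a hypothetical helper name, and U+00DD is chosen here simply because its UTF-8 encoding happens to end in the byte 9D.

```python
def continuation_payload(byte: int) -> int:
    """Return the 6 payload bits of a UTF-8 continuation byte (10xxxxxx)."""
    if (byte & 0b1100_0000) != 0b1000_0000:
        raise ValueError(f"{byte:#04x} is not a continuation byte")
    return byte & 0b0011_1111


# Cell 9D: binary 10011101, top two bits 10, remaining bits 011101.
payload = continuation_payload(0x9D)
print(f"9D -> +{payload:02X}")

# Cross-check against a real two-byte sequence: U+00DD encodes as C3 9D.
encoded = "\u00dd".encode("utf-8")
lead, cont = encoded
# A two-byte lead byte is 110xxxxx, so it contributes its low 5 bits.
code_point = ((lead & 0b0001_1111) << 6) | continuation_payload(cont)
print(f"U+{code_point:04X}")
```

The six payload bits are the six x's shown for each continuation byte in the table at the start of the article; the lead byte supplies the remaining high bits.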
Your 'home' page
Did you notice that someone has given you a home page? (They probably intended to put it here on your talk page.) --John Maynard Friedman (talk) 19:07, 17 November 2021 (UTC)
Yep, moved it here: I don't have any idea how to go back to the "no home page" state however.
- Nor do I. But in your case, the obvious response is to put a box in the middle saying "This page has intentionally been left blank", à la IBM manuals. :-D --John Maynard Friedman (talk) 21:37, 17 November 2021 (UTC)
Hexadecimal
Thanks for your edits to hexadecimal. This article includes much redundant content and my addition of the x' format was already included later.
As to "common":
In an article dated 6 May 2017: "About 95 percent of ATM swipes use COBOL code, Reuters reported in April", as well as 80 percent of in-person transactions. In fact, Reuters calculates that there’s still 220 billion lines of COBOL code currently being used in production today, and that every day, COBOL systems handle $3 trillion in commerce.
The "Second Edition (August 2009)" of the COBOL reference manual, p. 27, "Hexadecimal notation for alphanumeric literals", gives the forms X"hexadecimal-digits" and X'hexadecimal-digits'. DGerman (talk) 22:53, 30 October 2021 (UTC)
Bot changing Ordinal indicator
FYI, it is a faulty bot. The {{bot|deny=}} should have been enough to tell it to go away. It has been blocked until the author fixes it. See User talk:Qwerfjkl#bot problem?. --John Maynard Friedman (talk) 21:33, 17 November 2021 (UTC)
Your Removal of Information from Windows-1252 and Mac OS Roman
Dear User Spitzak: Page History shows that you recently (November 13 and 15) removed a great amount of visible information from the pages Windows-1252 and Mac OS Roman and numerous other pages. Notably, each cell in each table is now missing its Unicode code point and its former background coloration (indicating character class). The reason you cited in your edit summary was "consistency" (translation: "another article gives less information"); consistency is not generally an adequate reason for data deletion. If your edits were made as part of a group decision with community consensus, please add an audit trail (e.g. a link to the discussion) in the talk page and/or revision history of each affected article. 67.0.48.42 (talk) 22:31, 23 November 2021 (UTC)
- Enormous amounts of information were *ADDED* to the tables, not removed. The Unicode code point, and the NAME of the code point, are in the tooltip. Spitzak (talk) 23:56, 23 November 2021 (UTC)
- Yes the "colors" were removed. This allowed color highlighting to be used for *useful* purposes, rather than the boxes or checkerboard overlays that were forced before. Spitzak (talk) 23:59, 23 November 2021 (UTC)
- Just FYI, the tooltip doesn't appear on mobile web access (to MsDos extensions table) using Android+Chrome. I don't know that this especially matters or even how to fix it if it does, short of another table. --John Maynard Friedman (talk) 12:38, 24 November 2021 (UTC)
- People working on the Unicode code charts have experimented with ways of changing the text in a small attached area to the table to show information about the most-recently clicked cell. This has a lot of problems (for me their test always scrolled the table to the top of the window, and it certainly interferes with any ability to click on the cell to go anywhere). But it is being looked into. For now tooltips seem the best way, they can present a lot more data (ie the name of the Unicode code point) and they do not clutter the basic display. I also find it unlikely a mobile user is going to be able to use this information, especially the Alt codes, in any useful way.Spitzak (talk) 17:26, 24 November 2021 (UTC)
- On the original question: this was left up for weeks on the ASCII page with no comments ever posted. I also worked on the Unicode code pages and did get a comment that they wanted the dotted boxes preserved, as well as the cell sizes, both of which I did, and duplicated for these tables.Spitzak (talk) 17:28, 24 November 2021 (UTC)
Code page tables on Wikipedia no longer show Unicode code point equivalents contrary to the pages' claims
Dear Spitzak: the Wikipedia pages for code pages state "The following table shows (code page). Each character is shown with its Unicode equivalent." and "...shown with its Unicode code point.". This sentence is placed right in front of the tables. However, the Unicode code points are no longer (immediately) visible. Removing the Unicode equivalents from the code tables does not serve any purpose. Worse, it invalidates the Wikipedia pages' current content. In addition, the color-coded character class identifications were removed, which served useful categorical information.
Developers and enthusiasts rely on the Unicode equivalents U+xxxx listed in the code page tables, which are now no longer directly visible. I strongly argue that the Unicode U+xxxx equivalents should be rendered visually in the tables, in addition to the (linked) character graphic and the character class color coding. Removing the U+xxxx is bad for developers who need to map code page character definitions to Unicode, e.g. to update legacy software. There are plenty of non-Wikipedia resources that show code page tables with Unicode code point equivalents. Because of this change, I'm sure very few will continue using these Wikipedia tables as a reference and rather prefer external resources.
Case in point: some of the open source software I contributed to used the CP-12xx and ISO code page tables listed on Wikipedia as a resource to implement converters of legacy code pages to UTF-8. This unfortunate change makes it much harder to contribute additional tables in the future.
I already reverted the Sharp pocket computer character sets Wikipedia page that I've contributed to in the past and will do so again if necessary. For this second case in point: the Unicode equivalents are necessary to document Unicode-equivalent versions of Sharp pocket computer programs for historical documentation purposes, for which Wikipedia is an excellent resource.
This mass change serves no purpose but rather removes critical information. Please revert back to the "old style" tables as soon as possible without removing any information.
--Robert van Engelen 20:52, 9 December 2021 (UTC)
- No I will not revert this, it ADDS VITAL INFORMATION (the unicode code point names). And the colors made it impossible to use colors to indicate actual useful information (like differences and version changes). Any programmer who wants to mass-copy the unicode assignments can hit "edit" and work from there in a much more usable and convenient form than either the old or new table.
- These things were INCREDIBLY ugly and did not look like a table used by any other documentation in the world, and made Wikipedia just look incompetent. I have been trying to fix these for years. Don't give me any nonsense about these changes being undesirable. You are wrong. Spitzak (talk) 21:19, 9 December 2021 (UTC)
- Dear Spitzak: I am fine to talk objectively about content quality improvements that make sense, not throwing unbacked claims at each other as your response is unnecessarily confrontational. I'm reflecting that I am getting complaints about this problem since people rely on these pages now (since these are linked from other external URLs).
- I agree that displaying both the names and Unicode code points is useful. However, the names and Unicode code points are not visible in the tables. The old style tables may not be "beautiful" to you specifically, but that is not a technical criterion. It is a subjective assessment. Robert van Engelen 20:52, 9 December 2021 (UTC)
- Can you please forward the exact wording of these complaints. Thank you. Technically the new tables are superior, that is not a subjective assessment. They make it possible to highlight interesting information, they do not confuse people with numbers without a U+ prefix, and they provide useful information about the unicode code points so the user does not need to go to another source to look them up. Spitzak (talk) 22:16, 9 December 2021 (UTC)
- I think it's your subjective opinion. In my opinion, the new table is not superior at all. Some information you deleted, like the Unicode code points, was useful for people like me... Please listen to and respect other people's opinions. Your attitude toward discussion is too dogmatic.--𝒞𝒽ℯℯ𝓈ℯ𝒹ℴℊ (talk) 00:49, 21 December 2021 (UTC)
I agree with Robert van Engelen's opinion. The new table is not informative, and a lot of useful information has been deleted. Why do you regard the new table as "Superior" and some of the information as "Vital"? I don't think so.--𝒞𝒽ℯℯ𝓈ℯ𝒹ℴℊ (talk) 00:45, 21 December 2021 (UTC)
- The Unicode code point numbers and the names are in the tooltips.
- What about for mobile users? They usually can't see tool tips.--𝒞𝒽ℯℯ𝓈ℯ𝒹ℴℊ (talk) 06:17, 21 December 2021 (UTC)
- There does need to be a way to view the code points without being able to see the tooltips, yes. This is a non-issue for the Unicode block charts—since the Unicode code point is just the location in the chart (as read from column and row headers). It is absolutely an issue for charts for coded character sets other than Unicode itself. --HarJIT (talk) 18:43, 21 December 2021 (UTC)
- Best I can suggest is to hit "edit" which does put the text in a more useful form for those who are transcribing to some kind of lookup table. I really think the only practical way to do this is to get the text into some other editor which can do macros to translate it to the form needed for programming. Nobody in their right mind would try to transcribe visually, don't be daft. I think the clarifying "U+" prefix and the name of the code point are absolutely vital information and I'm sorry but I cannot see any way to put this into a table that also allows a character to be quickly visually identified and/or compared to another table.
- The people who designed the Unicode code point charts also think the inability to see the number and name on mobile is a problem and they made several attempts to fix it. Mostly trying to get a footer to update with the most recently-clicked cell's information. The best example I saw caused the window to scroll vertically on any click. Also it seems to make it difficult to implement a link to the Wikipedia page for the character. I think if a solution is found it is equally important to fix those charts as well. I'm wondering if a popup box identical to how footnotes popup would work, those seem to allow markup and links, which is a big improvement over tooltips or those footer box attempts.Spitzak (talk) 19:31, 21 December 2021 (UTC)
- I should point out that the idea that somebody trying to write software and copying the code point numbers to a lookup table is using a mobile phone is really pretty ridiculous so the issue is somewhat moot. I don't like however that the mobile user cannot see the code point names, they are very informative and interesting.Spitzak (talk) 19:33, 21 December 2021 (UTC)
- Spitzak, dismissing our comments as "ridiculous" is not helping to converge to a solution as a team. Our comments are technical observations (not only opinions.) I should also add that printing the Wikipedia page and taking screenshots won't show the Unicode code points, in addition to the fact that tooltips are not easy or possible to trigger on a tablet or mobile phone. Tooltips are nice, but should not be relied on as the sole mechanism by which crucial information is shared with the community. Furthermore, some Unicode characters do not show up correctly with the "new tables format" font, see for example U+00B0 in the Sharp pocket computer character sets which incorrectly becomes a solid block in the "new tables format" instead of a light gray block that is correct. These observations and our opinions strongly argue for including the U+xxxx code points in the tables and leave the Unicode names to the tooltips, because these are not fixed width. Removing the U+xxxx code points is radical, deletes crucial information and not generally acceptable given the technical objections raised. Robert van Engelen 20:50, 21 December 2021 (UTC)
- B0 shows correctly as a dotted box for me. This may be a problem with the font you are using? Are you sure it is correct on your machine in the old table version?Spitzak (talk) 21:06, 21 December 2021 (UTC)
- Nobody is going to print the Wikipedia page in order to transcribe a Unicode lookup table, that is just as silly as claiming they will do that from a mobile phone. They will hit "edit" and copy and paste the text into their IDE, and edit it there into the form they want. This is true for both versions of the table.Spitzak (talk) 21:09, 21 December 2021 (UTC)
- Your repeated claim earlier that the "new tables" represent the Unicode Consortium's tables is false: just look at the front page https://home.unicode.org that clearly shows the character graphic with its U+xxxx code point. The Unicode Consortium considers the code point information critical. We do too. It makes no sense to hide the Unicode code points from view. Leaving the U+xxxx code points out of the tables on Wikipedia invalidates the Wikipedia pages' claims, stating the tables show Unicode code points associated with the code pages, as we all expect the U+xxxx codes to be there. Also, why do you assume that people will "hit edit" to copy the Wikipedia source? It has a lot of stuff that is markup. That makes no sense. They will rather turn away from Wikipedia as a source that is lacking information and find an alternative source online. As for the font issues: symbols show up wrong on MacOS in Safari, Chrome and Firefox. I did not change fonts or anything. The U+00B0 symbol should be a light shade, lighter than U+00B1, but it is darker and almost solid. Why introduce a different font to begin with? That is asking for trouble. Robert van Engelen 02:47, 22 December 2021 (UTC)
- All Unicode tables have Unicode code points in the table: https://www.unicode.org/charts/PDF/U0080.pdf see Unicode Character Code charts Robert van Engelen 02:51, 22 December 2021 (UTC)
- That PDF certainly does have the number, with no U+, in the table. And those are the charts that they copied to make the dotted box around the control characters and I was told they insisted on the dotted box in order to match the PDF. I think I will have to experiment with getting that in there, but it is going to have to be *much* smaller and nearer the edge, similar to the PDF. What I want is the ability to easily compare adjacent glyphs. At least you seem to admit that the old colors were pointless, I really really wanted to get rid of them. Spitzak (talk) 04:20, 22 December 2021 (UTC)
- The table font was changed to be the same font used for any table. The previous tables were a bit inconsistent about this, there were attempts to make it a serif font but in many cases people added markup to the cells to switch it back.Spitzak (talk) 04:22, 22 December 2021 (UTC)
- Nope, putting the number at the bottom of the cell puts this back to unreadable. I have not found any way to get the gap between the glyph and number smaller so the table is a reasonable size. More importantly this does not work for glyphs that are more than one unicode code point, are not in unicode, or when there are alternatives. I also found I have to duplicate the number in the tooltip, which seems wrong, this is because without the "U+xxxx" prefix it is unclear what those all-caps words mean. In addition it is apparent the Unicode PDFs are using this number as a table index, as it is showing it on unassigned and invalid cells, which I think a lot of people think it is as there were continuous edits to 'fix' these to be the cell number rather than the Unicode code point.
- I would like an explicit clear indication of exactly how you have used this number. I have used them, but to compare tables or to locate mistakes in them, and in all cases I have had to use "edit" and copy and paste the text to another editor in order to do anything useful. The visible number is absolutely useless for this.
- A new idea is to make a huge, collapsible table attached to the bottom of the cell grid. This would have one line per character and the Unicode code point and name, and possibly other information like the Alt code. That would certainly be easier to read. An annoyance is that the information is duplicated, although possibly the tooltips could be removed from the grid. What do you think? Spitzak (talk) 18:59, 22 December 2021 (UTC)
- We all agreed that the code points are essential information. You admitted that the code points should be there. There is no need to have a U+ prefix or Unicode names in the table if that is your concern. Just the 4 or 5 hex digit code suffices, just like the Unicode consortium uses in their documentation. If you cannot figure out how to place the code points into the tables, then restoring the original tables will be the only reasonable option I'm afraid. Please do not create more complications by considering collapsible tables. Keep the tables as simple and understandable (as they used to be). That's all I'm asking for and others appear to agree with me. Robert van Engelen 21:48, 9 January 2022 (UTC)
- As I said before, it is clear they are using the numbers as a table *index*, not as a result in the table. This is in fact what most people expect from a small number printed on the edge of a box. In addition the numbers are not there in hundreds and hundreds of tables of characters on Wikipedia. I have yet to receive an explanation of how this "essential" information is used, as well. What exactly do you do with that number, other than look up the Unicode code point name, something the new tables have done for the reader? Spitzak (talk) 23:06, 9 January 2022 (UTC)
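For readers wondering what a code-page-to-Unicode lookup of the kind discussed in this thread looks like in practice, here is a minimal Python sketch. It assumes Python's bundled `cp1252` codec as a stand-in for the Windows-1252 table (it is not derived from the Wikipedia table itself), and `describe` is a hypothetical helper name. It produces both pieces of information argued over above: the U+xxxx code point and the Unicode character name.

```python
import unicodedata


def describe(byte_value: int, codec: str = "cp1252") -> str:
    """Map one code page byte to its Unicode code point and name."""
    try:
        ch = bytes([byte_value]).decode(codec)
    except UnicodeDecodeError:
        # Some byte values have no assignment in the chosen codec.
        return f"{byte_value:02X} -> (unassigned)"
    # Control characters have no formal name, so supply a placeholder.
    name = unicodedata.name(ch, "<unnamed>")
    return f"{byte_value:02X} -> U+{ord(ch):04X} {name}"


# One line per byte in the upper half of the code page.
table = [describe(b) for b in range(0x80, 0x100)]
print(table[0])
```

A list like `table` is close to the one-line-per-character layout proposed earlier in the thread, and it can be regenerated for any other codec Python ships by changing the `codec` argument.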