Poorly hyphenated words

I saw a word that I think was poorly hyphenated in justified Page view so I thought I would start a thread where others could post oddities that they see. I can't remember which resource it was but the word was "Well" hyphenated wel-l, with the second "L" appearing on the second line alone. It seemed weird to me.

Comments

  • Rosie Perera
    Rosie Perera Member Posts: 26,202 ✭✭✭✭✭

    Hmm, they probably have a hyphenation rule that says you can break a word between double L's. That would be correct if the double L's were in a multi-syllable word like fel·low or syl·la·ble. But you can't hyphenate a single syllable word. And you can't split double L's in certain double syllable words like well·ness. Hyphenation rules are not trivial!

    EDIT: It doesn't seem to be a great idea to have all the hyphenation oddities listed in a thread, since it will be easily lost and hard to find again. And there's no guarantee Logos will keep looking at it on into the future as more reports come in on it. Better to start a page of these on the wiki, along the same lines as the metadata correction proposals, books missing pagination and other indexes, etc.

  • George Somsel
    George Somsel Member Posts: 10,153 ✭✭✭


    Hmm, they probably have a hyphenation rule that says you can break a word between double L's. That would be correct if the double L's were in a multi-syllable word like fel·low or syl·la·ble. But you can't hyphenate a single syllable word. And you can't split double L's in certain double syllable words like well·ness. Hyphenation rules are not trivial!


    That's what happens when you have non-native English speakers attempting to get the text transcribed in a hurry.

    george
    gfsomsel

    יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

  • Rosie Perera
    Rosie Perera Member Posts: 26,202 ✭✭✭✭✭


    Hmm, they probably have a hyphenation rule that says you can break a word between double L's. That would be correct if the double L's were in a multi-syllable word like fel·low or syl·la·ble. But you can't hyphenate a single syllable word. And you can't split double L's in certain double syllable words like well·ness. Hyphenation rules are not trivial!


    That's what happens when you have non-native English speakers attempting to get the text transcribed in a hurry.

    I doubt it has anything to do with hurried text transcription. I bet they did not implement this in a standard way like many digital texts do, with invisible optional hyphen characters, but rather on the fly at text display time, using a set of hyphenation rules. Otherwise they would have had to go back through all of the old resources and add those optional hyphens, which they obviously did not do recently. And I doubt they've been putting them in all along "just in case." Furthermore, when I tried copying a word that had been hyphenated across the end of a line and pasting it into Word, I found that it did not have the invisible optional hyphen in it. You'd think that would have been copied as well if it were actually there.
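The copy-and-paste check described above can be sketched in a few lines of Python. This is only an illustration, assuming the optional hyphen arrives in the pasted text as the Unicode SOFT HYPHEN character (U+00AD); the function names and sample strings are hypothetical and say nothing about how Logos or Word store text internally.

```python
# Assumption: an "optional hyphen" is represented as U+00AD (SOFT HYPHEN).
SOFT_HYPHEN = "\u00ad"


def has_soft_hyphen(text: str) -> bool:
    """Return True if the text contains at least one soft hyphen."""
    return SOFT_HYPHEN in text


def strip_soft_hyphens(text: str) -> str:
    """Return the text with every soft hyphen removed."""
    return text.replace(SOFT_HYPHEN, "")


# Hypothetical pasted strings: one plain, one carrying optional hyphens.
pasted_plain = "hyphenation"
pasted_marked = "hy\u00adphen\u00ada\u00adtion"

print(has_soft_hyphen(pasted_plain))   # False
print(has_soft_hyphen(pasted_marked))  # True
```

If the pasted word behaves like `pasted_plain`, that is consistent with hyphenation being computed at display time rather than stored in the text.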

  • George Somsel
    George Somsel Member Posts: 10,153 ✭✭✭


    I doubt it has anything to do with hurried text transcription. I bet they did not implement this in a standard way like many digital texts do, with invisible optional hyphen characters, but rather on the fly at text display time, using a set of hyphenation rules. Otherwise they would have had to go back through all of the old resources and add those optional hyphens, which they obviously did not do recently. And I doubt they've been putting them in all along "just in case." Furthermore, when I tried copying a word that had been hyphenated across the end of a line and pasting it into Word, I found that it did not have the invisible optional hyphen in it. You'd think that would have been copied as well if it were actually there.

    It's fairly obvious that they don't use optional hyphens since I sometimes find hyphens in the middle of words which are in the middle of a line of text.  They are hard-coded. 

    george
    gfsomsel

    יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

  • Rosie Perera
    Rosie Perera Member Posts: 26,202 ✭✭✭✭✭

    It's fairly obvious that they don't use optional hyphens since I sometimes find hyphens in the middle of words which are in the middle of a line of text. They are hard-coded.

    Oh yes, there's another proof to add to the other two I suggested. I presume you're talking about when you find hyphens which don't belong. There's a difference between hard-coded hyphens (like the one I just used in "hard-coded") and optional hyphens which should only show up when the word is split across a line break. Logos does not appear to deal correctly with the latter when they appear in text. (At least they didn't before the new justification/hyphenation feature was released; the times I've stumbled upon the stray hyphen here or there in the past were before that.)

    Still, all this goes to show it has nothing to do with non-native English speakers rushing to transcribe the text. It's a coding error. English is full of so many spelling exceptions that I'm not sure it is possible to successfully code rules for when to hyphenate. I suppose they could include an entire hyphenation dictionary, like what Word does. I didn't notice such a thing getting downloaded, though. So I really don't know how they do it.

  • George Somsel
    George Somsel Member Posts: 10,153 ✭✭✭


    Still, all this goes to show it has nothing to do with non-native English speakers rushing to transcribe the text. It's a coding error. English is full of so many spelling exceptions that I'm not sure it is possible to successfully code rules for when to hyphenate. I suppose they could include an entire hyphenation dictionary, like what Word does. I didn't notice such a thing getting downloaded, though. So I really don't know how they do it.

    I still think it is due to non-English speakers transcribing the texts.  They apparently don't understand that a hyphen may be in the text at the end of a line but should not be entered into the electronic text since it would not fall at the end of a line (or would do so only rarely).  Besides, I recall that there were optional hyphens in WordPerfect, but I don't recall seeing them in MS Word.

    george
    gfsomsel

    יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

  • Rosie Perera
    Rosie Perera Member Posts: 26,202 ✭✭✭✭✭

    I still think it is due to non-English speakers transcribing the texts.  They apparently don't understand that a hyphen may be in the text at the end of a line but should not be entered into the electronic text since it would not fall at the end of a line (or would do so only rarely).  Besides, I recall that there were optional hyphens in WordPerfect, but I don't recall seeing them in MS Word.

    That could explain the problem of spurious hyphens appearing in the middle of lines where they don't belong. But it would not explain the problem of wel-l being hyphenated.

    MS Word does use optional hyphens, and you can turn on/off the visibility of them:

    [image]

    You can manually insert an optional hyphen in Word by pressing Ctrl+hyphen. Then it will appear only if the word is at the end of the line and won't all fit, or if you turn on the visibility of optional hyphens in the above dialog box.

    [image]

    Normally, Word is smart enough to know how to hyphenate words when they are at the end of the line (it has a hyphenation dictionary), but if you need to tell it how to hyphenate a word that it doesn't seem to have in its dictionary, that's how you'd do it.

    I never use justified text anyway, so I don't use the hyphenation feature.

    The times I have seen spurious hyphens in the middle of words in Logos they have looked like Word's optional hyphens, a short horizontal line with a little hook going down at a right angle. So that is why I suspect they really were meant to be optional hyphens in the original source documents, and Logos isn't handling them properly (making them invisible if they appear in the middle of a line). I know that there are non-native English speakers keying in texts, but they are given dictionaries to use and rules to follow for stuff like this.
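The expected behaviour of an optional hyphen (invisible in the middle of a line, shown only when the word breaks at the end of one) can be sketched as follows. This is a minimal illustration, assuming the optional hyphen is the Unicode SOFT HYPHEN (U+00AD); the function names and the `room` parameter are hypothetical and are not taken from Logos or Word.

```python
# Assumption: optional hyphens are stored as U+00AD (SOFT HYPHEN).
SOFT_HYPHEN = "\u00ad"


def show_mid_line(word: str) -> str:
    """A word that fits on the line is shown with its optional hyphens hidden."""
    return word.replace(SOFT_HYPHEN, "")


def break_at_line_end(word: str, room: int) -> tuple[str, str]:
    """Split a word at the last optional hyphen whose head fits in `room`.

    `room` is the number of characters left on the line, counting the
    visible hyphen. Returns (head_with_hyphen, tail), or ("", whole_word)
    when no break point fits and the word must move to the next line.
    """
    parts = word.split(SOFT_HYPHEN)
    # Try the latest break point first so the line is filled as far as possible.
    for i in range(len(parts) - 1, 0, -1):
        head = "".join(parts[:i])
        if len(head) + 1 <= room:
            return head + "-", "".join(parts[i:])
    return "", "".join(parts)


word = "syl\u00adla\u00adble"
print(show_mid_line(word))         # syllable
print(break_at_line_end(word, 6))  # ('sylla-', 'ble')
print(break_at_line_end(word, 4))  # ('syl-', 'lable')
```

A stray visible hyphen in the middle of a line corresponds to skipping the `show_mid_line` step and drawing the marker as if it were an ordinary hyphen.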

  • George Somsel
    George Somsel Member Posts: 10,153 ✭✭✭


    That could explain the problem of spurious hyphens appearing in the middle of lines where they don't belong. But it would not explain the problem of wel-l being hyphenated.

    Sometimes people simply make mistreaks (except for me, of course). ;)

    george
    gfsomsel

    יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

  • Clinton Thomas
    Clinton Thomas Member Posts: 465

    The definition of mistreak: "Accidentally running naked in public caused by a wardrobe malfunction"

    and how that applies here, I'll never understand.

     

  • Melissa Snyder
    Melissa Snyder Member Posts: 4,702 ✭✭✭


    I saw a word that I think was poorly hyphenated in justified Page view so I thought I would start a thread where others could post oddities that they see. I can't remember which resource it was but the word was "Well" hyphenated wel-l, with the second "L" appearing on the second line alone. It seemed weird to me.

    Philip ~  This is a bug, as our hyphenation code requires at least three characters on either side of the hyphen break. That means the word “well” should never hyphenate. We'll try to find a repro, but if you happen to recall the steps to reproduce it, please let us know.
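The rule described above (at least three characters on either side of the break) can be sketched as a simple filter over candidate break positions. This is an illustration only, not the actual Logos code: the function name, the `MIN_CHARS` constant, and the way candidate positions are supplied are all assumptions.

```python
# Assumption: the minimum is 3 characters on each side, as stated in the post.
MIN_CHARS = 3


def allowed_breaks(word: str, candidates: list[int], min_chars: int = MIN_CHARS) -> list[int]:
    """Keep only the break positions that leave enough characters on both sides.

    A break at index i splits the word into word[:i] and word[i:].
    """
    return [
        i for i in candidates
        if i >= min_chars and len(word) - i >= min_chars
    ]


# "wel-l" leaves a single character after the break, so it is rejected.
print(allowed_breaks("well", [3]))         # []
# "fel-low" leaves three characters on each side, so it is kept.
print(allowed_breaks("fellow", [3]))       # [3]
# "syl-lable" and "sylla-ble" both satisfy the minimum.
print(allowed_breaks("syllable", [3, 5]))  # [3, 5]
```

Under this filter a break such as wel-l can only appear if the check is bypassed somewhere, which fits the description of the report as a bug.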

  • spitzerpl
    spitzerpl Member Posts: 4,998

    Philip ~  This is a bug, as our hyphenation code requires at least three characters on either side of the hyphen break. That means the word “well” should never hyphenate. We'll try to find a repro, but if you happen to recall the steps to reproduce it, please let us know.

    I'll keep an eye out for anytime I see it happen in the future.

  • Rosie Perera
    Rosie Perera Member Posts: 26,202 ✭✭✭✭✭

    our hyphenation code requires at least three characters on either side of the hyphen break.

    That  seems  ex-
    cessively rigorous.

  • SteveF
    SteveF Member Posts: 1,866 ✭✭✭

    the word was "Well" hyphenated wel-l, with the second "L" appearing on the second line alone.

    Philip ~  This is a bug, as our hyphenation code requires at least three characters on either side of the hyphen break

    My example was similar to Philip's. I had "A Christmas Carol" in "reading" view; it was in two columns with the TOC showing. One of the words did the same thing at the end of the line, i.e., there was a hyphen and then one letter showed on the next line.

    Thanks

     

    Regards, SteveF

  • George Somsel
    George Somsel Member Posts: 10,153 ✭✭✭

    SteveF said:


    the word was "Well" hyphenated wel-l, with the second "L" appearing on the second line alone.

    Philip ~  This is a bug, as our hyphenation code requires at least three characters on either side of the hyphen break

    My example was similar to Philip's. I had "A Christmas Carol" in "reading" view; it was in two columns with the TOC showing. One of the words did the same thing at the end of the line, i.e., there was a hyphen and then one letter showed on the next line.

    Thanks

    Would you call that a "wel-l hyphenated" word?  [H]

    george
    gfsomsel

    יְמֵי־שְׁנוֹתֵינוּ בָהֶם שִׁבְעִים שָׁנָה וְאִם בִּגְבוּרֹת שְׁמוֹנִים שָׁנָה וְרָהְבָּם עָמָל וָאָוֶן

  • Melissa Snyder
    Melissa Snyder Member Posts: 4,702 ✭✭✭

    SteveF said:


    My example was similar to Philip's. I had "A Christmas Carol" in "reading" view; it was in two columns with the TOC showing. One of the words did the same thing at the end of the line, i.e., there was a hyphen and then one letter showed on the next line.

    Thanks--I'll see if I can repro.

  • SteveF
    SteveF Member Posts: 1,866 ✭✭✭

    Would you call that a "wel-l hyphenated" word?  Cool

    Thank you, George !!!

     

    Regards, SteveF

This discussion has been closed.