Discussion:
Convert those dastardly curly quotes to straight quotes on Windows?
harry newton
2017-10-07 21:38:03 UTC
How can we convert those dastardly curly quotes to straight quotes on Windows?
<http://i67.tinypic.com/2h5mjbr.jpg>

I like to save into TEXT files on Windows technical information cut and
pasted from disjoint news articles where the unprintable curly quotes drive
me nuts!

Here is a screenshot of a sample cut and paste:
<http://i67.tinypic.com/2h5mjbr.jpg>

I tried cutting from the web and pasting into MS Word and then cutting from
MS Word and pasting into the text file - but the dastardly curly quotes
were still there.

I tried using Google Gmail, pasting into a composition window and then
hitting the "Tx" format text button, and even changing the font to some
other font, but the dastardly curly quotes were still there.

Since almost every technical web site uses the dastardly curly quotes, how
can I just get *rid* of them using a Windows method so that I can have a
text file that contains normal quotes?

Here's just one sample but the web is filled with dastardly curly quotes!
<http://theverge.com/2017/10/6/16437790/iphone-8-swollen-battery-issue-apple-investigating>
Auric__
2017-10-07 22:57:53 UTC
Post by harry newton
How can we convert those dastardly curly quotes to straight quotes on Windows?
<http://i67.tinypic.com/2h5mjbr.jpg>
I like to save into TEXT files on Windows technical information cut and
pasted from disjoint news articles where the unprintable curly quotes
drive me nuts!
<http://i67.tinypic.com/2h5mjbr.jpg>
I tried cutting from the web and pasting into MS Word and then cutting
from MS Word and pasting into the text file - but the dastardly curly
quotes were still there.
I tried using Google Gmail, pasting into a composition window and then
hitting the "Tx" format text button, and even changing the font to some
other font, but the dastardly curly quotes were still there.
Since almost every technical web site uses the dastardly curly quotes,
how can I just get *rid* of them using a Windows method so that I can
have a text file that contains normal quotes?
Here's just one sample but the web is filled with dastardly curly quotes!
<http://theverge.com/2017/10/6/16437790/iphone-8-swollen-battery-issue-apple-investigating>
Copy the text to your text editor (or if already in a text file, open the
file). Select a "curly quote", copy it. Replace all, paste the copied curly
into the "find this" box, and then type a regular quote in the "replace it
with this" box, replace all. Repeat for the "close curly quotes".
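
The same find-and-replace can be scripted so it need not be repeated by hand. A minimal sketch in Python, assuming only the four common curly quotes matter; the table and the name `straighten_quotes` are made up for illustration:

```python
# Hypothetical helper: the scripted equivalent of four "Replace All" passes.
CURLY_TO_STRAIGHT = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (also used as an apostrophe)
}


def straighten_quotes(text):
    """Replace each curly quote with its straight ASCII counterpart."""
    for curly, straight in CURLY_TO_STRAIGHT.items():
        text = text.replace(curly, straight)
    return text


if __name__ == "__main__":
    print(straighten_quotes("\u201cIt\u2019s swollen,\u201d he said."))
```

Anything not listed in the table is left untouched, so other typographic characters would still need their own entries.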
--
You looking at my hog? Don't look at my hog... or my motorcycle.
harry newton
2017-10-07 23:23:34 UTC
Post by Auric__
Copy the text to your text editor (or if already in a text file, open the
file). Select a "curly quote", copy it. Replace all, paste the copied curly
into the "find this" box, and then type a regular quote in the "replace it
with this" box, replace all. Repeat for the "close curly quotes".
I should have mentioned that the curly quotes are just the tip of the
iceberg, and even they have "opening" and "closing" curly quotes, and even
then, they have "single" and "double" curly quotes ... and there's lots
more of this "curly-quote" stuff, so cutting and pasting isn't even close
to a solution.

I'm looking for a program that just does away with all non-standard
"us-ascii" characters that aren't on a typical American US English
keyboard.
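
A blunt version of "does away with everything that is not us-ascii" might look like the following Python sketch. It assumes that silently deleting a character is acceptable; the name `strip_to_ascii` is hypothetical:

```python
import unicodedata


def strip_to_ascii(text):
    """Keep ASCII, reduce accented letters to their base letter, drop the rest."""
    # NFKD splits e.g. an accented "e" into "e" plus a combining accent,
    # and expands compatibility characters such as the no-break space.
    decomposed = unicodedata.normalize("NFKD", text)
    # errors="ignore" discards every character that still is not ASCII.
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(strip_to_ascii("na\u00efve caf\u00e9"))
```

Characters with no decomposition, curly quotes included, are simply deleted by this approach, so a replacement table would have to run first if they are meant to become straight quotes.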

The use model requested is:
a. Copy dastardly bastardized text (which is most web pages)
b. Paste into this Jesus program (which absolves all curly-quote sins)
c. Then cut and paste out of that Jesus program into the text file or
Usenet post.

For more on those sinful dastard non-standard-character abominations, see:
<https://practicaltypography.com/straight-and-curly-quotes.html>
<https://www.theatlantic.com/technology/archive/2016/12/quotation-mark-wars/511766/>
etc.

Maybe this will work, but it's ponderous:
<https://support.office.com/en-us/article/Change-curly-quotes-to-straight-quotes-and-vice-versa-017963A0-BC5F-486B-9C9D-0EC511A8FB8F>
Mayayana
2017-10-08 03:16:53 UTC
"harry newton" <***@is.invalid> wrote

| > Copy the text to your text editor (or if already in a text file, open the
| > file). Select a "curly quote", copy it. Replace all, paste the copied curly
| > into the "find this" box, and then type a regular quote in the "replace it
| > with this" box, replace all. Repeat for the "close curly quotes".
|
| I should have mentioned that the curly quotes are just the tip of the
| iceberg, and even they have "opening" and "closing" curly quotes, and even
| then, they have "single" and "double" curly quotes ... and there's lots
| more of this "curly-quote" stuff, so cutting and pasting isn't even close
| to a solution.
|
| I'm looking for a program that just does away with all non-standard
| "us-ascii" characters that aren't on a typical American US English
| keyboard.
|
I can't see your tinypic links. Apparently they
require script. But I know what you mean. That also
drives me crazy. It's an entirely unnecessary
complication.
Auric's solution is the most realistic. I know there
are different characters, but usually not many.
Two kinds of curly quotes and unicode white
space are the most common. There's no way to
make a generic program to treat all possibilities
because you're substituting ANSI characters for
UTF-8. The possibilities go into the thousands.
If you just save as ANSI and then replace anything
funky, it's not too bad. Otherwise, you can just save
the text file as UTF-8.
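
To see which "funky" characters a given paste actually contains before replacing them, a short survey helps. This is only a sketch in Python; the function names are invented for illustration:

```python
from collections import Counter
import unicodedata


def survey_non_ascii(text):
    """Count each non-ASCII character so you know what needs replacing."""
    return Counter(ch for ch in text if ord(ch) > 127)


def report(text):
    """Print code point, Unicode name and count, most frequent first."""
    for ch, count in survey_non_ascii(text).most_common():
        name = unicodedata.name(ch, "UNKNOWN")
        print(f"U+{ord(ch):04X}  {name:<40} x{count}")


if __name__ == "__main__":
    report("\u201cSwollen\u201d battery\u00a0issue \u201cunder investigation\u201d")
```

If the list is short, as it usually is, a handful of replacements covers it.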

I agree with you. There's no reason to use curly
quotes. Using the ASCII versions means not needing
to use UTF-8 encoding. If UTF-8 were really necessary
it would be different, but most of the world lives by
ANSI. And webpages in European languages work just
fine with ANSI. Microsoft is one of the worst for that
problem. They write pages intended for an English-speaking
audience, in English, then use just a handful of unnecessary
UTF-8 characters that break the ANSI continuity. It makes
no sense.
pyotr filipivich
2017-10-08 04:15:58 UTC
Post by Mayayana
I agree with you. There's no reason to use curly
quotes. Using the ASCII versions means not needing
to use UTF-8 encoding. If UTF-8 were really necessary
it would be different, but most of the world lives by
ANSI. And webpages in European languages work just
fine with ANSI. Microsoft is one of the worst for that
problem. They write pages intended for an English-speaking
audience, in English, then use just a handful of unnecessary
UTF-8 characters that break the ANSI continuity. It makes
no sense.
IMOSHO, it makes no sense, but then it is Microsoft. Which often
seem to have a lot of "I'm sure it makes sense - not to me, but to
someone" elements.
--
pyotr filipivich
Next month's Panel: Graft - Boon or blessing?
Mayayana
2017-10-08 14:41:02 UTC
"pyotr filipivich" <***@mindspring.com> wrote

|>Microsoft is one of the worst for that
| >problem. They write pages intended for an English-speaking
| >audience, in English, then use just a handful of unnecessary
| >UTF-8 characters that break the ANSI continuity. It makes
| >no sense.
|
| IMOSHO, it makes no sense, but then it is Microsoft. Which often
| seem to have a lot of "I'm sure it makes sense - not to me, but to
| someone" elements.
|

That's a generous view. I don't see a problem
with switching to UTF-8, but what MS are doing is
to deliberately and unnecessarily break ASCII
compatibility without any need to do so, by replacing
quotes and spaces with unicode characters in UTF-8.
It seems to be a kind of political correctness attitude.
Nearly all English pages can easily be both ASCII and
UTF-8.
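
The point that a plain English page can be both ASCII and UTF-8 at once is easy to check. A small Python sketch, with sample strings made up for the purpose:

```python
def fits_in_ascii(text):
    """Return True if every character of text can be encoded as ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


plain = 'He said "hello".'
curly = "He said \u201chello\u201d."

if __name__ == "__main__":
    # Pure-ASCII text yields byte-for-byte identical output in both encodings.
    print(plain.encode("ascii") == plain.encode("utf-8"))
    # One pair of curly quotes is enough to make the text UTF-8 only.
    print(fits_in_ascii(curly))
    print(len(curly), len(curly.encode("utf-8")))
```

Each curly quote takes three bytes in UTF-8, against one byte for a straight quote.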

I wonder how journalists type those quotes. Maybe
they have a software program that does the conversion?
Jason
2017-10-07 23:06:29 UTC
Post by harry newton
Since almost every technical web site uses the dastardly curly quotes, how
can I just get *rid* of them using a Windows method so that I can have a
text file that contains normal quotes?
Curly quotes (dastardly) are "normal" quotes. The straight quotes were
ASCII (and EBCDIC) excuses for "real" (dastardly, curly) quotes...
J. P. Gilliver (John)
2017-10-07 23:12:05 UTC
Post by Jason
Post by harry newton
Since almost every technical web site uses the dastardly curly quotes, how
can I just get *rid* of them using a Windows method so that I can have a
text file that contains normal quotes?
Curly quotes (dastardly) are "normal" quotes. The straight quotes were
ASCII (and EBCDIC) excuses for "real" (dastardly, curly) quotes...
They (the straight quotes) preceded ASCII and probably EBCDIC by quite
some time - they're on my old Imperial typewriter ...
--
J. P. Gilliver. UMRA: 1960/<1985 MB++G()AL-IS-Ch++(p)***@T+H+Sh0!:`)DNAf

The thing about smut is it harms no one and it's rarely cruel. Besides, it's a
gleeful rejection of the dreary and the "correct".
- Alison Graham, RT 2014/10/25-31
Ken Blake
2017-10-07 23:19:19 UTC
On Sun, 8 Oct 2017 00:12:05 +0100, "J. P. Gilliver (John)"
Post by J. P. Gilliver (John)
Post by Jason
Post by harry newton
Since almost every technical web site uses the dastardly curly quotes, how
can I just get *rid* of them using a Windows method so that I can have a
text file that contains normal quotes?
Curly quotes (dastardly) are "normal" quotes. The straight quotes were
ASCII (and EBCDIC) excuses for "real" (dastardly, curly) quotes...
They (the straight quotes) preceded ASCII and probably EBCDIC by quite
some time - they're on my old Imperial typewriter ...
Yes, that's because they use only a single key. Curly quotes (the
normal quotes, as Jason said) would require two keys.
harry newton
2017-10-07 23:35:14 UTC
Post by Ken Blake
Post by J. P. Gilliver (John)
Post by Jason
Curly quotes (dastardly) are "normal" quotes. The straight quotes were
ASCII (and EBCDIC) excuses for "real" (dastardly, curly) quotes...
They (the straight quotes) preceded ASCII and probably EBCDIC by quite
some time - they're on my old Imperial typewriter ...
Yes, that's because they use only a single key. Curly quotes (the
normal quotes, as Jason said) would require two keys.
I stand completely humbled before you as I admit that curly quotes (and all
that other sinful related stuff that isn't on a US American computer
keyboard) came first, simply because, well, typography predates computers.

OK. So curly stuff came first. I do admit that fact.
But it's not just curly quotes. There's tons more stuff in web pages that
just don't render in text. The curly quote is just the tip of the iceberg.

All I want is so simple that I can't believe anyone wouldn't want it.
I cut and paste web text into a text file as I research stuff.
I just want all the pasted text to be visible characters, not black boxes.

What program or method easily converts all that curly typography stuff to
just the character set that ASCII Windows editors like Gvim use?
Stefan Ram
2017-10-08 00:07:10 UTC
Post by harry newton
What program or method easily converts all that curly typography stuff to
just the character set that ASCII Windows editors like Gvim use?
For example, lynx:

$ echo "abc»def«ghi" >tmp.html
$ lynx -dump -display_charset=US-ASCII tmp.html
abc>>def<<ghi

or »recode«:

$ cat tmp.html
abc»def«ghi
$ recode latin1..US_ASCII tmp.html
$ cat tmp.html
abc>>def<<ghi

. (Both programs can be found under UN*X more often than
under WIND*S.)
Wolf K
2017-10-08 00:58:20 UTC
On 2017-10-07 19:35, harry newton wrote:
[...]
Post by harry newton
All I want is so simple that I can't believe anyone wouldn't want it.
[...]

Well, I don't want it. I guess I'm ab/subnormal. :-)
--
Wolf K
kirkwood40.blogspot.com
"Wanted. Schrödinger’s Cat. Dead and Alive."
David Kleinecke
2017-10-08 01:03:50 UTC
Post by harry newton
Post by Ken Blake
Post by J. P. Gilliver (John)
Post by Jason
Curly quotes (dastardly) are "normal" quotes. The straight quotes were
ASCII (and EBCDIC) excuses for "real" (dastardly, curly) quotes...
They (the straight quotes) preceded ASCII and probably EBCDIC by quite
some time - they're on my old Imperial typewriter ...
Yes, that's because they use only a single key. Curly quotes (the
normal quotes, as Jason said) would require two keys.
I stand completely humbled before you as I admit that curly quotes (and all
that other sinful related stuff that isn't on a US American computer
keyboard) came first, simply because, well, typography predates computers.
OK. So curly stuff came first. I do admit that fact.
But it's not just curly quotes. There's tons more stuff in web pages that
just don't render in text. The curly quote is just the tip of the iceberg.
All I want is so simple that I can't believe anyone wouldn't want it.
I cut and paste web text into a text file as I research stuff.
I just want all the pasted text to be visible characters, not black boxes.
What program or method easily converts all that curly typography stuff to
just the character set that ASCII Windows editors like Gvim use?
I think sed would do the job provided you are sure exactly
what you want replaced by what. I used it once to
transliterate Arabic script into Latin script. That was
Unicode Arabic into one or two ASCII characters.
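
For anyone without sed on Windows, the same idea (a fixed table of "what gets replaced by what", where one character may become one or two ASCII characters) can be sketched in Python. The table below is an assumed example, not a complete list:

```python
import re

# Assumed table; each entry plays the role of one sed s/// command.
REPLACEMENTS = {
    "\u00ab": "<<",   # left-pointing double angle quotation mark
    "\u00bb": ">>",   # right-pointing double angle quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
}

# One alternation pattern that matches any key of the table.
PATTERN = re.compile("|".join(re.escape(ch) for ch in REPLACEMENTS))


def transliterate(text):
    """Replace every listed character with its ASCII stand-in in one pass."""
    return PATTERN.sub(lambda match: REPLACEMENTS[match.group(0)], text)


if __name__ == "__main__":
    print(transliterate("abc\u00bbdef\u00abghi"))
```

As with sed, the result is only as good as the table: a character that is not listed passes through unchanged.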
harry newton
2017-10-07 23:30:52 UTC
Post by Jason
Curly quotes (dastardly) are "normal" quotes. The straight quotes were
ASCII (and EBCDIC) excuses for "real" (dastardly, curly) quotes...
Depends on which side of the fence you live on.
Has the Internet Killed Curly Quotes?
<https://www.theatlantic.com/technology/archive/2016/12/quotation-mark-wars/511766/>

But I was just using "curly quotes" as just one of maybe a dozen or more
common dastardly abominations which just don't translate into text on
Windows, as shown in this simple example from Butterick's Practical
Typography:
<https://practicaltypography.com/index.html#toc>
Where curly quotes are just one of many evils:
<https://practicaltypography.com/straight-and-curly-quotes.html>

The problem is that my text editor (Gvim) isn't handling the dastardly
characters, so all I want to do is get rid of any character that any normal
text editor can't/won't/doesn't handle.

This ponderous Microsoft Office approach might work - but I'm hoping for a
far simpler and less monolithic solution to the basic problem that
everyone should have if they cut and paste into text from the web.
<https://support.office.com/en-us/article/Change-curly-quotes-to-straight-quotes-and-vice-versa-017963A0-BC5F-486B-9C9D-0EC511A8FB8F>
Stefan Ram
2017-10-07 23:57:54 UTC
Post by harry newton
The problem is that my text editor (Gvim) isn't handling the dastardly
characters, so all I want to do is get rid of any character that any normal
text editor can't/won't/doesn't handle.
If you can load the text into Gvim, and the text just is not
rendered correctly:

To get you started, here are some example vim functions. You
would write these into your .exrc or .vimrc (or .gvimrc?) file:

function! Example()
  execute "silent! %s/" . nr2char( 123 ) . nr2char( 132 ) . "/A/g"
  execute "silent! %s/B/C/g"
endfunction

This function replaces all occurrences of code 123 that are
followed by code 132 with "A" and then replaces all
occurrences of "B" by "C". (Vim requires the names of
user-defined functions to start with a capital letter.)

You can then call this from the (g)vim command line via:

call Example()

or bind it to a key-sequence (in your .exrc or some such):

map ,x :call Example()<CR>

. Now the key-sequence »,x« will call the function Example.

Well, you need to figure out the details, i.e., the codes of
the characters you wish to replace. I can't give you those
codes, since they might depend on some local details of your
configuration and on which quotes exactly you wish to replace.

If you can't load the text into Gvim:

You also can search and replace characters in Microsoft
Word, and then record this as a VBA macro and have it be
replayed upon a keypress. (However, searching specific quotes
with Word is tricky.)

Otherwise: If you save the text to a file, nearly every
programming language, like - for example - Python, will
allow you to replace the characters in the file.
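
A minimal Python sketch of that last route, assuming the saved file is UTF-8 and that only the four common curly quotes should be converted; `clean_file` and the file names are hypothetical:

```python
# Assumed mapping of curly quotes to straight quotes.
QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def clean_file(src, dst, encoding="utf-8"):
    """Read src, straighten its quotes, and write the result to dst as ASCII."""
    # newline="" keeps the original line endings untouched in both directions.
    with open(src, encoding=encoding, newline="") as infile:
        text = infile.read()
    text = text.translate(QUOTES)
    # errors="replace" writes "?" for anything still outside ASCII.
    with open(dst, "w", encoding="ascii", errors="replace", newline="") as outfile:
        outfile.write(text)


if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python clean_quotes.py in.txt out.txt
    clean_file(sys.argv[1], sys.argv[2])
```

If the file was saved in another encoding (say, the Windows "ANSI" code page), the `encoding` argument would need to be changed to match.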
J. P. Gilliver (John)
2017-10-08 09:17:22 UTC
In message <orbo3b$1jb2$***@gioia.aioe.org>, harry newton
<***@is.invalid> writes:
[]
Post by harry newton
The problem is that my text editor (Gvim) isn't handling the dastardly
characters, so all I want to do is get rid of any character that any normal
text editor can't/won't/doesn't handle.
[]
Of course, some would (and will) say why are you using a text editor
(probably inserting the word "still", to imply you're a dinosaur), but I
_do_ see the problem, and like you am surprised that no-one has written
a "simplify" utility that does what you want. (Or if they have, that
no-one has mentioned it.)

One _slight_ problem that might be encountered (though I would think it
could be overcome): non-standardisation. I've encountered - I can't say
in web pages, but I get it in emails/news posts that my elderly software
doesn't display as the originator intended. For example, let's take
those quotes you hate so much: sometimes, they've used something that my
software _does_ render as (in my case sloping rather than curly) quotes;
sometimes, they've used something that my software renders as some other
character (superscripted 2 and 3 are common); and sometimes they've used
something that just renders as a little rectangle, which is my
software's way of showing "unknown character" or something.

OK, you may say: the "simplify" utility could still handle this: it
would just have to render _all_ those possibilities into a quote
character (ASCII 34 decimal, "shifted 2" on some keyboards [as it's the
code for "2" with a bit changed - though I don't think it _is_ shifted 2
on the US layout]). But the potential problem comes when one example -
let's say font, though that's oversimplifying - uses code X for a quote
(albeit sloping/curly), but another one uses the _same_ code X for
something else: a space, say, or a ^, or %, or _. How would your
"simplify" utility know which to substitute? (But as I've said, I'm sure
it could be got over - perhaps by having it look at headers.)
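
The "same code X, different character" problem, and the idea of looking at the headers, can be illustrated with a short Python sketch. The header values and function names are assumptions made for the example:

```python
from email.message import Message


def charset_from_header(content_type, default="us-ascii"):
    """Pull the charset parameter out of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def decode_as_declared(raw, content_type):
    """Decode raw bytes using whatever charset the header declares."""
    # A charset name unknown to Python would raise LookupError here.
    return raw.decode(charset_from_header(content_type), errors="replace")


if __name__ == "__main__":
    raw = b"\x93quoted\x94"  # the same byte values, read three different ways
    for header in (
        'text/plain; charset="windows-1252"',
        'text/plain; charset="iso-8859-1"',
        'text/plain; charset="cp437"',
    ):
        print(header, repr(decode_as_declared(raw, header)))
```

Only under windows-1252 do bytes 0x93 and 0x94 come out as curly double quotes; under the other two charsets the same bytes decode to something else entirely, which is why a "simplify" utility would have to know the declared charset before substituting anything.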
--
J. P. Gilliver. UMRA: 1960/<1985 MB++G()AL-IS-Ch++(p)***@T+H+Sh0!:`)DNAf

As we journey through life, discarding baggage along the way, we should keep
an iron grip, to the very end, on the capacity for silliness. It preserves the
soul from desiccation. - Humphrey Lyttelton quoted by Barry Cryer in Radio
Times 10-16 November 2012
Wolf K
2017-10-08 00:55:46 UTC
Post by Jason
Post by harry newton
Since almost every technical web site uses the dastardly curly quotes, how
can I just get *rid* of them using a Windows method so that I can have a
text file that contains normal quotes?
Curly quotes (dastardly) are "normal" quotes. The straight quotes were
ASCII (and EBCDIC) excuses for "real" (dastardly, curly) quotes...
Same problem on typewriters, which is why some people think the curly
quotes aren't normal.
--
Wolf K
kirkwood40.blogspot.com
"Wanted. Schrödinger’s Cat. Dead and Alive."
Mike S
2017-10-08 05:30:46 UTC
Post by harry newton
How can we convert those dastardly curly quotes to straight quotes on Windows?
<http://i67.tinypic.com/2h5mjbr.jpg>
I like to save into TEXT files on Windows technical information cut and
pasted from disjoint news articles where the unprintable curly quotes drive
me nuts!
<http://i67.tinypic.com/2h5mjbr.jpg>
I tried cutting from the web and pasting into MS Word and then cutting from
MS Word and pasting into the text file - but the dastardly curly quotes
were still there.
I tried using Google Gmail, pasting into a composition window and then
hitting the "Tx" format text button, and even changing the font to some
other font, but the dastardly curly quotes were still there.
Since almost every technical web site uses the dastardly curly quotes, how
can I just get *rid* of them using a Windows method so that I can have a
text file that contains normal quotes?
Here's just one sample but the web is filled with dastardly curly quotes!
<http://theverge.com/2017/10/6/16437790/iphone-8-swollen-battery-issue-apple-investigating>
Looks like you may be able to do this within Word.

How to change smart or curly quotes to straight quotes in Microsoft Word
On the File tab, click Options.
Click Proofing, and then click AutoCorrect Options.
In the AutoCorrect dialog box, do the following:
- Click the AutoFormat As You Type tab, and under Replace as you type,
select or clear the "Straight quotes" with “smart quotes” check box.
- Click the AutoFormat tab, and under Replace, select or clear the
"Straight quotes" with “smart quotes” check box.
Click OK.
https://www.windowscentral.com/change-smart-quotes-straight-quotes-microsoft-office-word-outlook-powerpoint

Change curly quotes to straight quotes and vice versa
https://support.office.com/en-us/article/Change-curly-quotes-to-straight-quotes-and-vice-versa-017963a0-bc5f-486b-9c9d-0ec511a8fb8f

Replacing smart quotes with regular quotes
https://superuser.com/questions/1054418/replacing-smart-quotes-with-regular-quotes

Does this approach do what you need?
Whiskers
2017-10-08 11:54:27 UTC
Post by harry newton
How can we convert those dastardly curly quotes to straight quotes on Windows?
<http://i67.tinypic.com/2h5mjbr.jpg>
I like to save into TEXT files on Windows technical information cut and
pasted from disjoint news articles where the unprintable curly quotes drive
me nuts!
<http://i67.tinypic.com/2h5mjbr.jpg>
I tried cutting from the web and pasting into MS Word and then cutting from
MS Word and pasting into the text file - but the dastardly curly quotes
were still there.
I tried using Google Gmail, pasting into a composition window and then
hitting the "Tx" format text button, and even changing the font to some
other font, but the dastardly curly quotes were still there.
Since almost every technical web site uses the dastardly curly quotes, how
can I just get *rid* of them using a Windows method so that I can have a
text file that contains normal quotes?
Here's just one sample but the web is filled with dastardly curly quotes!
<http://theverge.com/2017/10/6/16437790/iphone-8-swollen-battery-issue-apple-investigating>
I think the problem isn't that some quotes are curly (which is what they
should be), but that some documents and web pages are generated using
software that ignores the standard way of coding such characters - so
that a copy/paste into standard-observing software reveals the
discrepancies.

Perhaps "demoroniser - correct moronic and gratuitously incompatible
Microsoft HTML" <http://www.fourmilab.ch/webtools/demoroniser/> is what
you're looking for. That page explains the problem as understood by the
author, and offers his solution as a Perl program. I don't know if (or
how) that works on Windows systems.
--
-- ^^^^^^^^^^
-- Whiskers
-- ~~~~~~~~~~
J. P. Gilliver (John)
2017-10-08 12:47:24 UTC
In message <***@ID-107770.user.individual.net>,
Whiskers <***@operamail.com> writes:
[]
Post by Whiskers
I think the problem isn't that some quotes are curly (which is what they
should be), but that some documents and web pages are generated using
[Who says they should (-:?]
Post by Whiskers
software that ignores the standard way of coding such characters - so
that a copy/paste into standard-observing software reveals the
discrepancies.
The OP wants to use a plain-text editor that only uses standard ASCII
(not "extended ASCII" or other codes) - i. e. characters between 32 and 126
decimal [plus newline]. He hasn't said why yet, but I understand what
he wants. (I was going to say "... like Notepad", but Notepad does allow
so-called "Extended ASCII", i. e. one particular set of the codes up to
255.) He is hoping for something that will render such text into
nearest-equivalent (such as quotes that have directional qualities all
into code 34 decimal).
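
Whether a file already meets that definition (codes 32 to 126 plus newline) can be checked mechanically. A Python sketch, with the assumption that carriage return is tolerated as part of a Windows line ending; `find_offenders` is a made-up name:

```python
ALLOWED_CONTROLS = {"\n", "\r"}  # assumption: CR is accepted alongside newline


def find_offenders(text):
    """Return (line, column, character) for everything outside codes 32-126."""
    offenders = []
    line, column = 1, 0
    for ch in text:
        if ch == "\n":
            line, column = line + 1, 0
            continue
        column += 1
        if ch in ALLOWED_CONTROLS or 32 <= ord(ch) <= 126:
            continue
        offenders.append((line, column, ch))
    return offenders


if __name__ == "__main__":
    for line, column, ch in find_offenders("ok\n\u201cx\u201d\tz"):
        print(f"line {line}, column {column}: U+{ord(ch):04X}")
```

Note that a tab (code 9) counts as an offender under this strict definition, which may or may not be what is wanted.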
[]
--
J. P. Gilliver. UMRA: 1960/<1985 MB++G()AL-IS-Ch++(p)***@T+H+Sh0!:`)DNAf

"In the _car_-park? What are you doing there?" "Parking cars, what else does one
do in a car-park?" (First series, fit the fifth.)