Microsoft crazy facts
MAGIC #1
Nobody can create a FOLDER named "CON" anywhere on the computer.
This is something pretty cool... and unbelievable.
At Microsoft, the whole team couldn't answer why this happened!
TRY IT NOW. IT WILL NOT CREATE A "CON" FOLDER.
MAGIC #2
This is something pretty cool and neat... and unbelievable.
At Microsoft, the whole team, including Bill Gates, couldn't answer why this happened!
Try it out yourself...
Open Microsoft Word and type
=rand(200,99)
and then press ENTER
MAGIC #3
For those of you using Windows, do the following:
1. Open an empty notepad file
2. Type "Bush hid the facts" (without the quotes)
3. Save it as whatever you want.
4. Close it, and re-open it.
Is it just a really weird bug?
You can try the same thing with another sentence: "this app can break"
Explanation for Magic #1:
In Windows, folder names and the special device names share the same namespace, so when you try to create a folder with a device name, Windows treats that name as already in use!
These special names are available irrespective of path.
You also cannot create a folder with any of these names:
CON, NUL, COM1 to COM9, and LPT1 to LPT9.
CON means console, COM1 means serial port 1, and LPT1 means parallel port 1.
From Pakistan, Karachi
Hi, Magic #3 does not work with "this app can break".
=rand(12,22) generates random text, and it is a well-known random function in Excel, so it does not come under the magic category, as most of us have been using it for several years.
From India, Delhi
Here is an explanation of the amazing bugs/facts about Microsoft Windows that are often posted to online forums and blogs and also travel via email. Many people don't know the secrets behind these tricks.
Nobody can create a FOLDER named "CON" anywhere on the computer. This is something pretty cool... and unbelievable.
Explanation: How to create a CON folder in Windows XP
Many people don't know that they cannot create a "CON" folder in Windows. (Type 1)
Some people don't know why they can't create it. (Type 2)
Very few know that they can still create it in a certain way, but they don't know why it has to be done exactly like that. (Type 3)
Now, after reading this, you will become one of the rest.
Type 1:
Try creating a folder named CON, LPT1, or COM1.
Now you are in the Type 2 category.
Type 2:
Not only CON; we cannot create any of these:
CON, PRN, AUX, CLOCK$, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 and more
The reason is that CON, PRN, LPT1 to LPT9, etc. are the names of underlying devices that date from the time DOS was written. If you were allowed to create such folders, there would be an ambiguity about where to write data that is supposed to go to those devices. In other words, if I want to print something, what Windows does internally is write the data to PRN (you can think of PRN, CON, etc. as virtual folders at the device level). So if we were able to create a CON folder, Windows would not know whether to write the data to the virtual CON or to the real folder.
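The check that gets in the way here can be sketched in a few lines of Python. This is only an illustration, not how Windows itself is written: the function name is_reserved_name is made up, the list is the one given in this thread, and treating "con.txt" the same as "CON" is an assumption based on how classic Windows ignores the extension for these names.

```python
# Reserved device names as listed in this thread (assumption: the list is
# illustrative, and CLOCK$ only matters on older systems).
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL", "CLOCK$"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_reserved_name(name: str) -> bool:
    """Return True if a file or folder name collides with a reserved device name.

    The comparison ignores case, trailing dots/spaces, and anything after
    the first dot (assumption: "con.txt" is treated like "CON").
    """
    base = name.strip().rstrip(". ").split(".", 1)[0].strip()
    return base.upper() in RESERVED_NAMES


if __name__ == "__main__":
    for candidate in ["CON", "con.txt", "Lpt1", "CONSOLE", "notes"]:
        print(candidate, is_reserved_name(candidate))
```

A program that builds folder names from user input could run a check like this first, instead of finding out from a failed create.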
So now, try this...
The smart way!
Open the Command Prompt via Start -> Run and typing cmd
Code:
C:\> md \\.\c:\con
Now open My Computer and browse to the path where you created the CON folder... Surprising? Yeah, you have created it successfully.
Now try to delete the folder from My Computer.
OOPS!!! You can't delete it...
Now try this in the command prompt:
Code:
C:\> rd \\.\c:\con
Yeah!! You did it...
Type 3 :
Well, let us now have a glance at how we were able to create it...
It is just because of the UNC path. The Universal Naming Convention, or UNC, specifies a common syntax to describe the location of a network resource, such as a shared file, directory, or printer. Since these conventions didn't exist under pure DOS, they are not backward compatible. The UNC syntax for Windows systems is as follows:
\\RemoteHost\sharedfolder\resource
where RemoteHost is the computer name or IP address of the remote computer whose shared folder you wish to access. The rest is the path.
(Here \\remotehost\drive:\con doesn't make sense anyway, because without a process on the remote host there is no current 'console'.) It would be a security hazard as well to have the serial and parallel ports accessible to everyone who is allowed to read or write in any single directory.
The "." in the command \\.\c:\con stands for the local computer. The \\.\ prefix addresses the Win32 device namespace directly, so the name is passed on without the usual check for reserved DOS device names. That is why the folder can be created this way.
Here is another simple way:
md c:\con\
rmdir /s c:\con\
Make sure that when you want to delete the con folder, you place the \ at the end of con, or it will think you want to delete a file instead. This is the only thing that I have found that truly works for me when I have to delete con.
The answer to everyone's questions...... what are reserved folder names and why are they reserved?
Several special file names are reserved by the system and cannot be used for files or folders. These are:
CON, AUX, COM1, COM2, COM3, COM4, LPT1, LPT2, LPT3, PRN, NUL
These are special keywords used in DOS and their use may accidentally cause problems with your system... here are their keyword descriptions;
CON - Console
PRN - Printer, usually a parallel port
AUX - Auxiliary device, usually a serial port
CLOCK$ - System real-time clock
NUL - Null device
COM1 - First serial communications port
COM2 - Second serial communications port
COM3 - Third serial communications port
COM4 - Fourth serial communications port
LPT1 - First parallel printer port
LPT2 - Second parallel printer port
LPT3 - Third parallel printer port
These names are a legacy of MS-DOS, like short file names in the 8.3 format, and are kept so that Windows stays compatible with MS-DOS and other legacy software. Playing with these names and creating garbage folders like you all are trying to do may cause your system to crash, so if I were you, I would not continue playing around with that sort of stuff. But that's just my own opinion. So, good luck.
Type 4:
Of course, now you are Type 4. What else can I say?
----------------------------------------------------------------------------
Explanation for Magic #2:
It says "the quick brown fox jumps over the lazy dog" in 200 paragraphs, with each paragraph saying it 99 times. It does this so that people can test out fonts and markups on their computer, as "the quick brown fox jumps over the lazy dog" uses all 26 letters of the alphabet.
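The claim that the sentence uses every letter is easy to check. Here is a small Python sketch (the function name is_pangram is just a name picked for this example):

```python
import string


def is_pangram(sentence: str) -> bool:
    """Return True if the sentence contains every letter a-z at least once."""
    letters_seen = {ch for ch in sentence.lower() if ch in string.ascii_lowercase}
    return len(letters_seen) == 26


print(is_pangram("The quick brown fox jumps over the lazy dog"))  # True
print(is_pangram("Bush hid the facts"))  # False
```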
It looks like an Easter egg, but it is only a not-well-documented feature of MS Word. If you would like to insert dummy text into a document using MS Word, you can do so by typing =rand() and pressing ENTER. You can also pass arguments to the rand() function, rand(p,s), where p is the number of paragraphs and s is the number of sentences that you want to appear in each paragraph. Neat, eh?
This feature is turned on by default, and is disabled when the Replace text as you type option is turned off. To turn this option on or off, click AutoCorrect on the Tools menu, click the AutoCorrect tab, and then click to select or clear the Replace text as you type check box.
Note: Word will not insert sample text when the insertion point immediately follows either a PAGE BREAK or a COLUMN BREAK.
MORE INFORMATION
By default, the sample text contains three paragraphs, with each paragraph containing five sentences. You can control how many paragraphs and sentences appear by adding numbers inside the parentheses.
The =rand() function has the following syntax
=rand(p,s)
where p is the number of paragraphs and s is the number of sentences that you want to appear in each paragraph.
Examples:
=rand(1) inserts one five-sentence paragraph of text
=rand(1,1) inserts one one-sentence paragraph of text.
=rand(1,2) inserts one two-sentence paragraph of text
=rand(2) inserts two five-sentence paragraphs of text
=rand(2,1) inserts two one-sentence paragraphs of text
=rand(10) inserts 10 five-sentence paragraphs of text
=rand(10,1) inserts 10 one-sentence paragraphs of text
=rand(10,10) inserts 10 ten-sentence paragraphs of text
Note: When you omit the second number, the default is five sentences of text. The maximum number that can be used inside the parentheses is 200 (this number may be lower depending on the number of paragraphs and sentences specified).
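To make the p and s arguments concrete, here is a rough Python imitation of the behaviour described above. It is only a sketch: sample_text is a made-up name, the defaults (3 paragraphs, 5 sentences) and the limit of 200 are taken from this post, and Word's real limit can be lower, as the note says.

```python
SENTENCE = "The quick brown fox jumps over the lazy dog."
MAX_ARG = 200  # upper limit quoted in the note above


def sample_text(p: int = 3, s: int = 5) -> str:
    """Return p paragraphs, each made of s copies of the sample sentence."""
    if not (1 <= p <= MAX_ARG and 1 <= s <= MAX_ARG):
        raise ValueError("p and s must be between 1 and 200")
    paragraph = " ".join([SENTENCE] * s)
    # Paragraphs are separated by a single newline in this sketch.
    return "\n".join([paragraph] * p)


if __name__ == "__main__":
    print(sample_text(2, 1))
```

Calling sample_text() with no arguments gives three five-sentence paragraphs, like =rand() in the description above.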
Ref:
http://support.microsoft.com/kb/212251/en-us
----------------------------------------------------------------------
Explanation: Bush hid the facts
Bush hid the facts is the common name for a bug found in the Microsoft Windows XP version of Notepad. It is sometimes referred to as an Easter egg but is not an intentional addition to the program.
While "Bush hid the facts" is the sentence most often forwarded around the Internet, the bug does not exclusively occur with that phrase. The bug will manifest itself with many sentences of a particular structure: one word with 4 letters, two or more words with 3 letters, one word with 5 letters. Except for the starting character, all the letters must be lower case. Other phrases that will expose the same bug are: "Bill can not dance", "John has the parts", "this app can break", and "Feel the new power".
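The structure described here can be written down as a regular expression. A small Python sketch follows (fits_pattern is a made-up name). Note that it only tests the shape of the phrase; fitting the pattern does not by itself guarantee that Notepad will garble the file.

```python
import re

# One 4-letter word (first letter may be upper case), two or more
# 3-letter words, then one 5-letter word; all other letters lower case.
PATTERN = re.compile(r"[A-Za-z][a-z]{3}(?: [a-z]{3}){2,} [a-z]{5}")


def fits_pattern(phrase: str) -> bool:
    """Return True if the whole phrase has the 4, 3, 3 (or more), 5 shape."""
    return PATTERN.fullmatch(phrase) is not None


if __name__ == "__main__":
    for phrase in ["Bush hid the facts", "this app can break", "Hello world"]:
        print(phrase, fits_pattern(phrase))
```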
This little Windows Notepad "trick" is often posted to online forums and blogs and also travels via email. When such a string is entered into the Windows XP or Windows NT/2000 versions of Notepad (with no other characters) and then saved as a text file as instructed above, the re-opened file displays nine Chinese characters (or squares if the language pack has not been installed).
The first image below shows the text before closing the Notepad file. The second image shows the text as it is displayed after the file is re-opened:
[Image: "Bush hid the facts" before closing] [Image: "Bush hid the facts" after re-opening]
Some of the more wide-eyed conspiracy theorists postulate that this result is a form of political commentary directed against US President Bush and was knowingly and deliberately programmed into Notepad by Microsoft.
Alas, the truth is far less compelling. It appears that a lot of other character strings in the pattern 4 letters, 3 letters, 3 letters and 5 letters will give the same result. For example, the phrase "Bill fed the goats" also displays the garbled text as shown below:
[Image: "Bill fed the goats" before closing] [Image: "Bill fed the goats" after re-opening]
In fact, even a line of text such as "hhhh hhh hhh hhhhh" will elicit the same results.
Since I first published this article, a few readers have pointed out that some character strings that fit the "4,3,3,5" pattern do not generate the error. For example, the phrase "Bush hid the truth" is displayed normally. However, conspiracy theorists should not take this as aiding their argument. "Fred led the brats", "brad ate the trees" and other strings also escape the error.
Thus, any hint of political conspiracy fades into oblivion and is replaced by a rather mundane programming bug. It seems probable that a certain combination and/or frequency of letters in the character string causes Notepad to misinterpret the encoding of the file when it is re-opened. If the file is originally saved as "Unicode" rather than "ANSI", the text displays correctly. Older versions of Notepad, such as those that came with Windows 95, 98 or ME, do not include Unicode support, so the error does not occur.
So, nothing weird here at all... except perhaps for the fact that someone, somewhere had nothing better to do than turn a simple software glitch into another lame conspiracy theory.
Discovery
The bug first appeared in Windows XP and was not discovered immediately, for two reasons. Earlier Microsoft systems are much richer in Easter eggs, so they receive far more attention from people looking for secrets. Microsoft also formally stopped including Easter eggs in its systems because of fears that they could be used as cover for a logic bomb, which meant very few people were actively looking for XP Easter eggs. The bug was discovered during the summer of 2006 and has since become widespread on internet blogs and in chat rooms.
Notepad misinterprets the encoding of the file when it is re-opened. If the file is originally saved as "Unicode" rather than "ANSI" the text displays correctly. Older versions of Notepad such as those that came with Windows 95, 98 or ME do not include Unicode support so the error does not occur. The version of Notepad included with Windows Vista has been corrected to prevent this error.
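What the misinterpretation does to the bytes can be reproduced in Python. This sketch does not imitate Notepad's guessing; it simply assumes the wrong guess (little-endian Unicode) has already been made and shows the result.

```python
text = "Bush hid the facts"

# Saved as ANSI: one byte per character, 18 bytes in total.
raw = text.encode("ascii")

# Wrongly read back as UTF-16 little-endian: every two bytes become one character.
garbled = raw.decode("utf-16-le")

print(len(raw), len(garbled))  # 18 9
print([f"U+{ord(ch):04X}" for ch in garbled])
```

Each pair of ASCII letters lands in the CJK ideograph range, which is why nine Chinese characters (or nine squares) show up.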
Some files come up strange in Notepad
The reason is that Notepad has to edit files in a variety of encodings, and when its back is against the wall, sometimes it's forced to guess.
Here's the file "Hello" in various encodings:
48 65 6C 6C 6F
This is the traditional ANSI encoding.
48 00 65 00 6C 00 6C 00 6F 00
This is the Unicode (little-endian) encoding with no BOM.
FF FE 48 00 65 00 6C 00 6C 00 6F 00
This is the Unicode (little-endian) encoding with BOM. The BOM (FF FE) serves two purposes: first, it tags the file as a Unicode document, and second, the order in which the two bytes appear indicates that the file is little-endian.
00 48 00 65 00 6C 00 6C 00 6F
This is the Unicode (big-endian) encoding with no BOM. Notepad does not support this encoding.
FE FF 00 48 00 65 00 6C 00 6C 00 6F
This is the Unicode (big-endian) encoding with BOM. Notice that this BOM is in the opposite order from the little-endian BOM.
EF BB BF 48 65 6C 6C 6F
This is UTF-8 encoding. The first three bytes are the UTF-8 encoding of the BOM.
2B 2F 76 38 2D 48 65 6C 6C 6F
This is UTF-7 encoding. The first five bytes are the UTF-7 encoding of the BOM. Notepad doesn't support this encoding.
Notice that the UTF-7 BOM encoding is just the ASCII string "+/v8-", which is difficult to distinguish from just a regular file that happens to begin with those five characters (as odd as they may be).
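Most of the byte sequences listed above can be regenerated with Python's standard codecs, which is a handy way to double-check them. In this sketch the labels are informal and "ANSI" is stood in for by plain ASCII, which gives the same bytes for "Hello"; UTF-7 is left out.

```python
import codecs

word = "Hello"


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


encodings = {
    "ANSI (ASCII)": word.encode("ascii"),
    "UTF-16 LE, no BOM": word.encode("utf-16-le"),
    "UTF-16 LE, BOM": codecs.BOM_UTF16_LE + word.encode("utf-16-le"),
    "UTF-16 BE, no BOM": word.encode("utf-16-be"),
    "UTF-16 BE, BOM": codecs.BOM_UTF16_BE + word.encode("utf-16-be"),
    "UTF-8, BOM": word.encode("utf-8-sig"),
}

for label, data in encodings.items():
    print(f"{label:18} {hexdump(data)}")
```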
The encodings that do not have special prefixes and which are still supported by Notepad are the traditional ANSI encoding (i.e., "plain ASCII") and the Unicode (little-endian) encoding with no BOM. When faced with a file that lacks a special prefix, Notepad is forced to guess which of those two encodings the file actually uses. The function that does this work is IsTextUnicode, which studies a chunk of bytes and does some statistical analysis to come up with a guess.
And as the documentation notes, "Absolute certainty is not guaranteed." Short strings are most likely to be misdetected.
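The easy half of the job, recognising a file by its prefix, might look like the Python sketch below. It is not Notepad's code: sniff_bom is a made-up name and only the three BOMs discussed here are checked. When it returns None, the reader is left in exactly the guessing situation described above.

```python
import codecs

# Checked in order; the labels are the informal names used in this thread.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "Unicode (little-endian)"),
    (codecs.BOM_UTF16_BE, "Unicode (big-endian)"),
]


def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None  # no prefix: the reader has to guess


if __name__ == "__main__":
    print(sniff_bom(bytes.fromhex("EFBBBF48656C6C6F")))
    print(sniff_bom(b"Bush hid the facts"))
```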
What happens is...
Whenever we open a text file in Notepad, it determines whether the text is ASCII or Unicode. But there is actually no way to determine this with certainty. There is a function called IsTextUnicode in the Windows API. The function runs some tests on the text based on statistical data. This is what MSDN says about the function:
"The function uses various statistical and deterministic methods to make its determination. These tests are not foolproof. The statistical tests assume certain amounts of variation between low and high bytes in a string, and some ASCII strings can slip through. For example, if lpBuffer points to the ASCII string 0x41, 0x0A, 0x0D, 0x1D (A\n\r^Z), the string passes the IS_TEXT_UNICODE_STATISTICS test, though failure would be preferable."
And the book "Programming Applications for Microsoft Windows" by Jeffrey Richter says that this function does not give accurate results if the text is too short for the tests.
Oohhh. Enough techie stuff.
If you open the file explicitly in ANSI mode in Notepad, you can see the text correctly.
There can be only two reasons why Notepad shows the file correctly under Wine. One is that there may be no Unicode support in Wine; the other is that Wine has much better statistical and deterministic methods than Microsoft to determine whether the text is ASCII or Unicode. But I don't think the second one is possible.
From India, Kochi