Channel: prgmr.io » casting

Getting the ASCII/UTF-8 value of a string

To display the character associated with an ASCII/UTF-8 code, do the following:

PS> [char]97
a

Thanks to this PowerTip.

Now how about the reverse? Can I get the ASCII/UTF-8 code of a character?

PS> [int]'a'
Cannot convert value "a" to type "System.Int32". Error: "Input string was not in a correct format."
At line:1 char:1
+ [int]'a'
+ ~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvalidCastFromStringToInteger
# Bummer! How about if I set it in a variable and then try?
PS> $a="a"
PS> [int]$a
Cannot convert value "a" to type "System.Int32". Error: "Input string was not in a correct format."
At line:1 char:1
+ [int]$a
+ ~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvalidCastFromStringToInteger
# Maybe the double quotes were causing the letter to be interpreted as a string rather than a character?
PS> $a='a'
PS> [int]$a
Cannot convert value "a" to type "System.Int32". Error: "Input string was not in a correct format."
At line:1 char:1
+ [int]$a
+ ~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvalidCastFromStringToInteger

No luck. But I think I am on the right track.

The error InvalidCastFromStringToInteger is what tells me PowerShell is trying to cast from a [string] to [int] – which is not what I want and will obviously fail. I want to cast from a [char] to [int] so let’s be explicit about that.

PS> [int][char]'a'
97

Good, that works!
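
A quick way to confirm what is actually being cast is to ask each value for its type. The sketch below is not part of the session above, and the variable names are made up for illustration. It shows that both quote styles produce a [string], which is why swapping the quotes made no difference, and that only the explicit [char] cast yields something [int] will convert.

```powershell
# Both quote styles create a System.String; the quotes alone never give a char.
$single = 'a'
$double = "a"
$single.GetType().FullName    # System.String
$double.GetType().FullName    # System.String

# An explicit cast is what produces a System.Char.
$char = [char]'a'
$char.GetType().FullName      # System.Char

# The char converts to its numeric code; the one-letter string does not.
[int]$char                    # 97
```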

Now how about getting the ASCII/UTF-8 codes of a string? Can I do that?

As expected, you can’t just pass two characters and hope it works!

PS> [int][char]'as'
Cannot convert value "as" to type "System.Char". Error: "String must be exactly one character long."
At line:1 char:1
+ [int][char]'as'
+ ~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvalidCastParseTargetInvocation

What I need is an array of [char] elements, which I can then type cast to an array of [int] elements.

First, let’s see whether there’s a method available to convert a string to an array of characters.

PS> "abc" | gm
   TypeName: System.String
Name             MemberType            Definition
----             ----------            ----------
...
ToChar           Method                char IConvertible.ToChar(System.IFormatProvider provider)
ToCharArray      Method                char[] ToCharArray(), char[] ToCharArray(int startIndex, int length)
...

Looks like there is. Does the following work?

PS> "abc".ToCharArray()
a
b
c
PS> [char]"abc".ToCharArray()
Cannot convert the "System.Char[]" value of type "System.Char[]" to type "System.Char".
At line:1 char:1
+ [char]"abc".ToCharArray()
+ ~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : ConvertToFinalInvalidCastException

No, but that gives me a hint on the solution. The output of the ToCharArray() method is of the data type System.Char[] whereas [char] is shorthand for the System.Char data type.

PS> [char] | gm -Static
   TypeName: System.Char
Name               MemberType Definition
----               ---------- ----------
ConvertFromUtf32   Method     static string ConvertFromUtf32(int utf32)
...

So maybe [char[]] is what I need? Does such a data type exist?

PS> [char[]] | gm -Static
   TypeName: System.Char[]
Name            MemberType Definition
----            ---------- ----------
AsReadOnly      Method     static System.Collections.ObjectModel.ReadOnlyCollection[T] AsReadOnly[T](T[] array)
...

Sure enough it does!

So let’s try the following:

PS> [char[]]"abc".ToCharArray()
a
b
c
PS> [char[]]"abc"
a
b
c

I don’t need the ToCharArray() method either: if I just type cast a string to an array of characters, the method is invoked implicitly. Sweet!

Armed with this info, I try type casting the character array to an array of integers to get the ASCII/UTF-8 values:

PS> [int[]][char[]]"abc"
97
98
99

Nice!
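
One caveat on terminology, as far as I understand it: a .NET [char] is a UTF-16 code unit, so the numbers above are strictly UTF-16 values. For plain ASCII text they match the UTF-8 bytes, but beyond ASCII they diverge. A minimal sketch, using é (U+00E9) as an arbitrary example and building it from its code so the script file's encoding doesn't get in the way (the $text variable is only for illustration):

```powershell
# Build the string from a code so the script file's encoding doesn't matter.
$text = [string][char]0xE9                     # é

# Casting gives the UTF-16 code unit.
[int[]][char[]]$text                           # 233

# The UTF-8 encoding of the same character is two bytes.
[System.Text.Encoding]::UTF8.GetBytes($text)   # 195, 169

# For ASCII-only text the two views agree.
[int[]][char[]]"abc"                           # 97, 98, 99
[System.Text.Encoding]::UTF8.GetBytes("abc")   # 97, 98, 99
```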

Can I make this better? Say I had a longish string: the above snippet just gives a bunch of codes, and that’s not very helpful if I want to see the code of a particular letter. Can I get the output to show each character followed by its ASCII/UTF-8 code? Something like this:

PS> [char[]]"abc" | %{ "$_ -> [int]$_" }
a -> [int]a
b -> [int]b
c -> [int]c

D’oh! Doesn’t help. But I am on the right track, and I know what to do. You see, within double quotes only the variable $_ is expanded; the [int] cast is not evaluated and is treated as literal text (thanks to this Hey, Scripting Guy! post). So I have to force evaluation through one of the methods mentioned in that post. I prefer the VBScript-style approach of concatenating strings, so here goes:

PS> [char[]]"abc" | %{ "$_ -> " + [int]$_ }
a -> 97
b -> 98
c -> 99
PS> [char[]]"i am a longish string" | %{ "$_ -> " + [int]$_ }
i -> 105
  -> 32
a -> 97
m -> 109
  -> 32
a -> 97
  -> 32
l -> 108
o -> 111
n -> 110
g -> 103
i -> 105
s -> 115
h -> 104
  -> 32
s -> 115
t -> 116
r -> 114
i -> 105
n -> 110
g -> 103

Bingo!
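
For comparison, here are two other ways to force evaluation when building the string: a subexpression, $( ), and the -f format operator. This is a sketch alongside the concatenation used above, not a replacement for it; both variants should print the same lines as the concatenation does for "abc".

```powershell
# Subexpression: $( ) is evaluated inside a double-quoted string.
[char[]]"abc" | ForEach-Object { "$_ -> $([int]$_)" }

# Format operator: placeholders are filled from the values on the right.
[char[]]"abc" | ForEach-Object { '{0} -> {1}' -f $_, [int]$_ }

# Both print:
# a -> 97
# b -> 98
# c -> 99
```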

