In this post, we will see how to handle encoding and decoding of data in JavaScript. Recently, I worked on a project where I had to read paragraphs of data stored in a database column of type VARCHAR2 or CLOB. This data needed to be transmitted as JSON and then, after some processing, displayed on a web page. The data contained some special characters, such as à. When I rendered this content in a browser, I was surprised to see some weird characters instead of à. Debugging this issue seemed like a nightmare until I understood the concepts of encoding and decoding. So let's understand these concepts first and then look at a solution to such problems.
As you all know, a computer cannot store "letters", "numbers", "pictures" or anything else. The only thing it can store and work with is bits. A bit can only have two values: yes or no, true or false, 1 or 0. To use bits to represent anything at all besides bits, we need rules. We need to convert a sequence of bits into something like letters, numbers and pictures using an encoding scheme, or encoding for short.
One well-known encoding scheme is ASCII. A string of 1s and 0s is broken down into parts of eight bits each (a byte for short). The ASCII encoding specifies a table translating bytes into human-readable letters. Here's a short excerpt of that table:
bits       character
01000001   A
01000010   B
01000011   C
01100001   a
01100010   b
The ASCII encoding encompasses a character set of 128 characters. Since this charset doesn't cover all the symbols used in different languages, many other charsets were invented over time to cover them.
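
To make the byte-to-letter translation described above concrete, here is a minimal sketch that splits a string of bits into 8-bit chunks and looks each one up as an ASCII character. The function name `decodeAsciiBits` is only an illustrative assumption, and the sketch assumes the input length is a multiple of eight.

```javascript
// Hypothetical helper: decode a string of 1s and 0s as ASCII, one byte at a time.
function decodeAsciiBits(bits) {
  const chars = [];
  for (let i = 0; i < bits.length; i += 8) {
    // Read eight bits as a base-2 number (0..255).
    const byte = parseInt(bits.slice(i, i + 8), 2);
    // Map the number to its character.
    chars.push(String.fromCharCode(byte));
  }
  return chars.join('');
}

const decoded = decodeAsciiBits('011000010110001001100011');
console.log(decoded); // "abc"
```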
All you need to know is this: data may be saved using any encoding scheme, but to read it correctly you have to know which encoding scheme was used so that you can decode it accordingly. Just remember this whenever you are dealing with text or any other content as a developer: use one specific encoding to transmit data and the matching decoding to read it.
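
As a rough illustration of what happens when the reader assumes a different encoding than the writer, the sketch below encodes à as UTF-8 and then reads the same bytes back two ways. It assumes an environment where `TextEncoder` and `TextDecoder` are available (modern browsers and recent Node.js versions); the "wrong" branch simply treats every byte as one Latin-1 character, which is only one of many possible mismatches and not necessarily the one behind the problem described in this post.

```javascript
// Encode "à" as UTF-8. It becomes two bytes: 0xC3 0xA0.
const utf8Bytes = new TextEncoder().encode('à');

// Correct: decode with the same scheme that was used to encode.
const correct = new TextDecoder('utf-8').decode(utf8Bytes);

// Wrong: interpret each byte as a separate Latin-1 character.
const misread = Array.from(utf8Bytes, (b) => String.fromCharCode(b)).join('');

console.log(correct);        // the original character
console.log(misread.length); // 2: "Ã" followed by a non-breaking space
```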
Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding. Base64 encoding schemes are commonly used when binary data needs to be stored or transferred over media that are designed to deal with textual data. This ensures that the data remains intact, without modification, during transport.
In JavaScript, there are two functions for decoding and encoding Base64 strings, respectively: atob() and btoa(). The atob() function decodes a string of data which has been encoded using Base64 encoding. Conversely, the btoa() function creates a Base64-encoded ASCII string from a "string" of binary data. Both atob() and btoa() work on strings.
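
A quick round trip shows the two functions in action. This is just a sketch with plain ASCII input, assuming an environment where `btoa` and `atob` exist as globals (browsers, and recent Node.js versions).

```javascript
// Encode a plain ASCII string to Base64.
const encoded = btoa('Hello');
console.log(encoded); // "SGVsbG8="

// Decode it back to the original string.
const decodedText = atob(encoded);
console.log(decodedText); // "Hello"
```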
However, simply using these functions did not help me, and a few special characters were still not readable on my web page.
The "Unicode Problem"
Since DOMStrings are 16-bit-encoded strings, in most browsers calling window.btoa on a Unicode string will cause a Character Out Of Range exception if any character exceeds the range of an 8-bit byte (0x00~0xFF). Please refer to the documentation for more details on this.
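
The sketch below shows both sides of this limitation, assuming `btoa` is available as a global. A character above 0xFF makes `btoa` throw, while à (U+00E0) fits in one byte and is encoded without an error, but as the single byte 0xE0 rather than as UTF-8. A receiver that expects UTF-8 would then not get à back, which may be one way such characters end up unreadable.

```javascript
// "✓" is U+2713, which does not fit into a single byte, so btoa throws.
let threw = false;
try {
  btoa('✓');
} catch (err) {
  threw = true;
}
console.log(threw); // true

// "à" is U+00E0, so btoa accepts it, but encodes the lone byte 0xE0.
const latin1Only = btoa('à');
console.log(latin1Only); // "4A=="
```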
One solution to this is to escape the whole string (with UTF-8, see encodeURIComponent) and then encode it:
function b64EncodeUnicode(str) {
  // First we use encodeURIComponent to get percent-encoded UTF-8,
  // then we convert the percent encodings into raw bytes which
  // can be fed into btoa.
  return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
    function toSolidBytes(match, p1) {
      // p1 is the two-digit hex value of one byte.
      return String.fromCharCode(parseInt(p1, 16));
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="
To decode the Base64-encoded value back into a string:
function b64DecodeUnicode(str) {
  // Going backwards: from bytestream, to percent-encoding, to original string.
  return decodeURIComponent(atob(str).split('').map(function (c) {
    // Turn each byte into a zero-padded, percent-encoded hex pair.
    return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
  }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"
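
As an alternative sketch (not the approach used above), the same UTF-8 round trip can be written with `TextEncoder` and `TextDecoder` instead of percent-encoding. The function names `b64EncodeUtf8` and `b64DecodeUtf8` are hypothetical, and the code assumes an environment that provides `TextEncoder`, `TextDecoder`, `btoa` and `atob` as globals.

```javascript
// Hypothetical helper: string -> UTF-8 bytes -> binary string -> Base64.
function b64EncodeUtf8(str) {
  const bytes = new TextEncoder().encode(str);
  let binary = '';
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one character per byte, as btoa expects
  }
  return btoa(binary);
}

// Hypothetical helper: Base64 -> binary string -> UTF-8 bytes -> string.
function b64DecodeUtf8(b64) {
  const binary = atob(b64);
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

const roundTrip = b64DecodeUtf8(b64EncodeUtf8('✓ à la mode'));
console.log(b64EncodeUtf8('✓ à la mode')); // "4pyTIMOgIGxhIG1vZGU="
console.log(roundTrip);                    // "✓ à la mode"
```

Building the binary string in a loop keeps the sketch simple; for very large inputs a chunked approach would probably be preferable.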
If you want to do the encoding in PL/SQL before transmitting the data to the client side, as I had to, you can use Base64 encoding in SQL. I will share a post on that soon. Please subscribe to my blog for all updates.