Before you turn this problem in, make sure everything runs as expected. To do so, restart the kernel and then run all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).
Make sure you fill in any place that says YOUR CODE HERE or "YOUR ANSWER HERE".
Q1 One of the reasons for the existence of Unicode is its ability to use strings that go beyond the limitations of the keyboard. Relative to the discussion in the chapter, Unicode is about the strings we can use in our programs, and the issue of how they translate/map to a sequence of bytes (i.e. their encoding) is a separate concept.
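The separation between a string and its encoding can be seen in a small sketch. The sample character below is a hypothetical one chosen only for illustration (it is not the string s used in this exercise): one character in a program yields different byte sequences depending on the encoding applied.

```python
# Hypothetical sample, not the string s from the exercise.
text = "é"                            # one character, code point U+00E9

as_utf8 = text.encode("utf-8")        # the bytes under UTF-8
as_utf16 = text.encode("utf-16-be")   # the bytes under UTF-16BE

print(len(text))                      # 1 (characters, not bytes)
print(as_utf8.hex())                  # c3a9
print(as_utf16.hex())                 # 00e9
```

The string itself is the same in both cases; only the mapping to bytes differs.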
When we have the code-point (generally a hex digit sequence identifying an index into the set of characters) for a Unicode character that is beyond our normal keyboard characters, we can include it in our strings by using the \u escape prefix followed by the hex digits for the code-point. Consider the Python string s:
s = "Unicode examples: \u2B2C and \u266A and \u1F60 and " \
"\u265E and \u0394 and \u0402"
Write code to print s, then assign to b8 the UTF-8 encoding of s, and to b16 the UTF-16BE encoding of s. For each, use the hex() method of the bytes data type to see a hex version of the encoded values. Use the following code cell for your Python sequence. Then, in the subsequent Markdown cell, answer the following questions:
b8, b16, and for the two hex() transformations.
s?
# YOUR CODE HERE
raise NotImplementedError()
YOUR ANSWER HERE
Q2 Write a function shiftLetter(letter, n) whose parameter, letter, should be a single character. If the character is between "A" and "Z", the function returns the upper case character $n$ positions further along, wrapping around to the start of the alphabet if the + $n$ mapping goes past "Z". Likewise, it should map the lower case characters between "a" and "z". If the parameter letter is anything else, or not of length 1, the function should return letter.
Hint: review the functions ord() and chr() from the section, as well as the modulus operator %.
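As a brief refresher on the pieces named in the hint, the sketch below shows ord(), chr(), and % used separately and then together to wrap a position within a 26-letter range. The variable names are illustrative assumptions, and this is not a complete shiftLetter.

```python
# ord() maps a character to its integer code point; chr() is the inverse.
code_of_A = ord("A")
print(code_of_A)                  # 65
print(chr(66))                    # B

# The modulus operator keeps a position inside the range 0..25.
offset = ord("Y") - ord("A")      # position of "Y" in the alphabet: 24
wrapped = (offset + 3) % 26       # 27 wraps around to 1
print(chr(ord("A") + wrapped))    # B
```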
# YOUR CODE HERE
raise NotImplementedError()
assert True
Q3 Building on the previous exercise, write a function encrypt(plaintext, n) that applies shiftLetter to each of the letters in plaintext, accumulating and returning the resultant string.
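The accumulation pattern mentioned here can be sketched in isolation. The helper transform_each and its func parameter are hypothetical names used only for illustration, and str.upper stands in for whatever per-character function is being applied.

```python
def transform_each(text, func):
    """Hypothetical helper: apply func to each character and accumulate."""
    result = ""                    # start with an empty accumulator
    for ch in text:
        result = result + func(ch) # append the transformed character
    return result

print(transform_each("abc", str.upper))  # ABC
```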
# YOUR CODE HERE
raise NotImplementedError()
assert True
Q4 Write a function singleByteChars(s) that takes its argument, s, and determines whether or not all the characters in s can each be encoded by a single byte. The function should return the Boolean True if so, and False otherwise.
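To see how the number of bytes per character can vary, the sketch below encodes a few sample characters one at a time. It assumes that "a single byte" refers to the UTF-8 encoding, which the exercise does not state explicitly; the sample characters and variable names are illustrative.

```python
# Assumption: byte counts are measured under UTF-8.
samples = ["A", "\u00e9", "\u2B2C"]

byte_counts = []
for ch in samples:
    encoded = ch.encode("utf-8")        # encode one character on its own
    byte_counts.append(len(encoded))    # number of bytes it needs
    print(repr(ch), len(encoded), encoded.hex())

print(byte_counts)                      # [1, 2, 3]
```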
# YOUR CODE HERE
raise NotImplementedError()
print(singleByteChars("Hello \u2B2C"))
singleByteChars("Hello")
assert True
Q5 Suppose you have, in your Python program, a variable that refers to a bytes data type, such as mystery, which refers to the bytes literal given here:
mystery = b'\xc9\xa2\x95}\xa3@\x89\xa3@\x87\x99\x85'\
b'\x81\xa3@\xa3\x96@\x82\x85@\xa2\x96\x93'\
b'\xa5\x89\x95\x87@\x97\x99\x96\x82\x93\x85'\
b'\x94\xa2o@@\xe8\x96\xa4@\x82\x85\xa3Z'
Perhaps this value came from a network message, or from a file. But you suspect that it, in fact, holds the bytes for a character string, and you need to figure out how it is encoded. Assume that you have narrowed the encodings down to one of the following:
Write code to convert the byte sequence to a character string, and determine the correct encoding. By the end of your code, assign to s the "correct" decoding translation.
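One general way to compare candidate encodings is to attempt each decode and catch the failures. The sketch below uses a hypothetical byte sequence and a hypothetical candidate list (neither is taken from this exercise) to show the mechanics of bytes.decode() and UnicodeDecodeError.

```python
# Hypothetical data and candidates, chosen only to illustrate the technique.
sample_bytes = b"h\xe9llo"
candidates = ["ascii", "utf-8", "latin-1"]

results = {}
for name in candidates:
    try:
        results[name] = sample_bytes.decode(name)
    except UnicodeDecodeError:
        results[name] = None      # these bytes are not valid in this encoding

for name, decoded in results.items():
    print(name, repr(decoded))
```

A decode that succeeds is not necessarily the right one: some encodings accept any byte sequence, so the decoded text still has to be read and judged for whether it makes sense.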
# YOUR CODE HERE
raise NotImplementedError()
assert True