Posted on 2019-06-25.
Below I explain a few progressively-more-efficient solutions to the “Special Palindrome Again” problem. Note that they seem to have recently renamed it to “Special String Again” - I think that’s good, since the problem isn’t about palindromes. If you see references to “special palindrome” in discussions of this problem, just know that it means the same thing as what’s now referred to as a “special string”.
My solutions are written in Kotlin.
I’ll just show how to implement the substrCount
function; for the main
function, I’m using the template HackerRank provides.
If you just want hints, without the full solution, see: June KC Code Challenge Hints.
Table of Contents:
A straightforward way to approach this problem is to break it into two subproblems:
Then you can easily check whether each substring is special, and record the total count of how many are.
The problem says a string counts as “special” if either all its characters are the same, or all the characters except the one in the middle are the same.
An algorithm to test for specialness would be:
Here’s a function for finding the middle index, which returns -1
if there is no middle index:
fun middleIndex(s: String) = if (s.length % 2 == 1) s.length / 2 else -1
That says, simply, if the string has an odd-numbered length (the remainder is 1 when you divide its length by 2), then the middle index is the length divided by 2.
Using that, we can write our function to test whether a string is special:
fun isSpecial(s: String) = s.withIndex().all { it.value == s[0] || it.index == middleIndex(s) }
withIndex
allows us to operate on a list containing each character of the string along with that character’s index in the string.
all
tests each element of that list against a condition that we provide, and returns true
if the condition returns true
for all - not just some - of the elements.
Our condition says that either the character must be equal to the first character of the string, or else this must be the middle index.
For the string “aab”, the list of all substrings is ["a", "aa", "aab", "a", "ab", "b"]
.
The algorithm to generate that is:
Here’s code to do that, written in a functional style:
fun substrings(s: String) = (0 until s.length).asSequence().flatMap { i -> ((i + 1)..s.length).asSequence().map { j -> s.substring(i, j) } }
Some things to notice:
asSequence()
lets us operate on collections lazily.
This means that when you call substrings("aab")
, it won’t immediately construct a six-element list.
Rather, each time you ask for another element from the list, it will generate that element right then.
This way, we don’t actually have to keep all the substrings in memory at once, and our solution can have a space complexity of O(1).flatMap
says that, for each entry in a given list (in this case, the list of indexes from 0 to length - 1), we’re going to generate a list of new elements, but we want them all combined into one output list.
If we used map
instead of flatMap
here, the output of substrings("aab")
would be [["a", "aa", "aab"], ["a", "ab"], ["b"]]
.Now, we just have to count how many substrings are special:
fun substrCount(n: Int, s: String) = substrings(s).count(::isSpecial).toLong()
This says: find all substrings, then apply the isSpecial
function to each of them, and count how many of them it returned true
for.
Given a string of length N, the substrings
function will generate a list that contiains N + (N - 1) + (N - 2) + (N - 3) … 1 elements.
For large numbers, that is much closer to N2 than to N, so for simplicity we’ll say it generates N2 elements.
We call isSpecial
on each of those elements, so the time complexity will be O(N2 * (time complexity of isSpecial
)).
isSpecial
looks once at each character of the string it’s given, and we’ll be giving it strings of up to length N, so we can say its time complexity is O(N).
That means the overall time complexity is O(N3).
Consider how the previous solution handles the input string "aab"
.
The substrings are ["a", "aa", "aab", "a", "ab", "b"]
.
Let’s think about how just the first three, ["a", "aa", "aab"]
, are handled.
They’re all generated from the same starting index, 0, and each contains just one character appended onto the previous.
But each call to isSpecial
will look at every character in the substring again, even though the previous call to isSpecial
already looked at all those characters except for the one new one at the end.
That seems like wasted work; how can we optimize it?
Instead of generating all substrings, let’s just generate the substrings from each start index to the end of the string:
fun tailStrings(s: String) = (0 until s.length).asSequence().map { s.substring(it) }
Given "aab"
, this will return ["aab", "ab", "b"]
.
If we somehow write an O(N) function to determine how many specials a string starts with, then we can pass it each of those and we’ll have the total count of specials in O(N2) time.
For example, if we can efficiently determine that "aab"
starts with two specials - "a"
and "aa"
- we don’t need to worry about the additional special "a"
in the middle, since we’ll detect that one later when we check the string "ab"
.
To find the number of special strings at the start of a string, we need to consider the two types of special string.
Type 1: All the same character.
It’s easy to figure out how many of these type-1 special strings are at the start of a string.
We just loop through the characters until we find one that’s not equal to the first character.
If the string starts with, say, 4 'a'
s, then it starts with four type-1 special strings (["a", "aa", "aaa", "aaaa"]
).
fun leadingCount(s: String, c: Char) = s.takeWhile { it == c }.length fun leadingType1s(s: String) = leadingCount(s, s[0]).toLong()
Type 2: All the same character except the middle.
Once we know how many of the same first character there are at the beginning of the string, we can skip one character, then count out how many times the first character appears in a row after that.
If that’s equal to or greater than the number at the beginning of the string, then there’s one type-2 special string at the start.
For example, "abaac"
starts with one type-2 special string ("aba"
); "aabac"
does not start with a type-2 special string (even though it contains one).
fun leadingType2s(s: String): Long { val beforeMiddle = leadingCount(s, s[0]) val afterMiddle = leadingCount(s.drop(beforeMiddle + 1), s[0]) return if (afterMiddle >= beforeMiddle) 1 else 0 }
Now we can combine the leadingType1s
, leadingType2s
, and tailStrings
functions to find the total number of special substrings:
fun leadingSpecials(s: String) = leadingType1s(s) + leadingType2s(s) fun substrCount(n: Int, s: String) = tailStrings(s).map(::leadingSpecials).sum()
The previous solution is still doing a lot of redundant work.
For example, given "aaab"
, it will count how many leading 'a'
s there are in "aaab"
, and then it will count how many leading 'a'
s there are in "aab"
, and then how many in "ab"
.
It would be much faster if we just went through the string once and built a list of how many times each character occurs in a row.
First, let’s define a data structure to hold a character and a count of its consecutive occurrences:
data class CharCount(val char: Char, val count: Long)
Then let’s write a function that loops through the string and emits a CharCount instance every time it encounters a character different from the previous character:
fun consecutives(s: String) = sequence { var count = 1L for (i in 1..s.length) { if (i == s.length || s[i] != s[i-1]) { yield(CharCount(s[i-1], count)) count = 0L } count++ } }
This function returns a sequence, instead of a List
, to minimize memory usage.
In Kotlin, sequences can be lazily generated, meaning that each element is generated only when you ask for it, rather than generating the whole list up front and storing it in memory.
To understand how this works, let’s walk through what happens when you call consecutives("aabaacc")
.
The resulting sequence will have four elements: [CharCount('a', 2), CharCount('b', 1), CharCount('a', 2), CharCount('c', 2)]
.
This isn’t calculated immediately, though.
The function returns an object of type Sequence<CharCount>
immediately.
Every time we ask for the next element in that sequence, the code in the block we provided to sequence
will be executed until it calls yield
.
The parameter to yield
is what will be used as the next element in the sequence.
So, the first time we ask for an element, count
will be set to 1 and we’ll start iterating through the string.
As soon as i
is 2, we’ll notice that we’ve encountered a new character ('b'
), so we will yield a CharCount
instance containing the previous character ('a'
) and the number of times we counted seeing it (2).
Then this block of code is suspended until another element is requested from the sequence.
Once that happens, it will resume executing at the count = 0L
line, and continue until it finds the next character change.
The problem specifies two types of special string we need to count. The first is where all the characters are the same.
If you know the length of a string and you know that it contains all the same character, then it doesn’t matter what that character is. The number of different substrings that can be formed depends entirely on the length:
"a"
, there’s one (["a"]
)."aa"
, there’s three (["a", "aa", "a"]
)."aaa"
, there’s six (["a", "aa", "aaa", "a", "aa", "a"]
)So, using that formula and our sequence of consecutive-character counts from above, we can calculate the number of type-1 special strings efficiently:
fun sumToN(n: Long) = n * (n + 1) / 2 fun type1s(s: String) = consecutives(s).map { sumToN(it.count) }.sum()
The other type of special string consists of two strings that have all the same character, separated by a string of length 1 with a different character.
Using the sequence of consecutive character counts from above, we can efficiently find all the groups of three that contain type-2 special strings:
fun isType2Triplet(t: List<CharCount>) = t[1].count == 1L && t[0].char == t[2].char fun type2Triplets(s: String) = consecutives(s).windowed(3).filter(::isType2Triplet)
windowed(3)
gives us a sequence that contains groups of three elements starting from each element of the original sequence.
To illustrate:
expression | result |
---|---|
s | "aaabaacc" |
consecutives(s) |
[CharCount('a', 3), CharCount('b', 1), CharCount('a', 2), CharCount('c', 2)] |
consecutives(s).windowed(3) |
[[CharCount('a', 3), CharCount('b', 1), CharCount('a', 2)], [CharCount('b', 1), CharCount('a', 2), CharCount('c', 2)]] |
We then filter down that list to just those groups of three in which the first and third use the same character, and the middle is of length 1.
How do you figure out how many type-2 special strings are in each of those groups? Consider two examples:
"aaabaa"
(represented by [CharCount('a', 3), CharCount('b', 1), CharCount('a', 2)]
), there are 2: "aba"
and "aabaa"
."abaa"
(represented by [CharCount('a', 1), CharCount('b', 1), CharCount('a', 2)]
), there’s just 1: "aba"
.You can always form progressively longer type-2 special strings, working outward from the center, until one side or the other runs out of identical characters. So the number of type-2 specials is just the lesser of the character count on the left and the character count on the right:
fun type2s(s: String) = type2Triplets(s).map { minOf(it[0].count, it[2].count) }.sum()
Now all we need to do is add the totals of the type-1 and type-2 special strings:
fun substrCount(n: Int, s: String) = type1s(s) + type2s(s)
Here’s the complete code again (excluding the main
method, for which you can use the HackerRank template):
data class CharCount(val char: Char, val count: Long) fun consecutives(s: String) = sequence { var count = 1L for (i in 1..s.length) { if (i == s.length || s[i] != s[i-1]) { yield(CharCount(s[i-1], count)) count = 0L } count++ } } fun sumToN(n: Long) = n * (n + 1) / 2 fun type1s(s: String) = consecutives(s).map { sumToN(it.count) }.sum() fun isType2Triplet(t: List<CharCount>) = t[1].count == 1L && t[0].char == t[2].char fun type2Triplets(s: String) = consecutives(s).windowed(3).filter(::isType2Triplet) fun type2s(s: String) = type2Triplets(s).map { minOf(it[0].count, it[2].count) }.sum() fun substrCount(n: Int, s: String) = type1s(s) + type2s(s)
The consecutives
function looks at each character of the string only once, making it O(N) time complexity.1 map
, sum
, windowed
, and filter
should all only need to look at each element of their input sequences once2, and emit at most one element for each input element3, so they don’t push our runtime beyond O(N).
The space complexity of this solution, if you ignore the memory used by the input string, is O(1)4. We’re never holding more than three CharCount instances in memory at a time.
type1s
and type2Triplets
), which is wasteful - you could refactor the program to only call it once - but this just adds a constant factor to the runtime (O(2N)), which is negligible when comparing against algorithms of O(N2), so we can ignore that in our analysis.windowed(3)
presumably has to look at each element three times, but again, that’s a negligible constant factor.windowed(3)
emits a list of three elements for each input element, but since that’s a constant size, it doesn’t affect this analysis.map
, sum
, windowed(3)
, and filter
all are implemented in a way that uses O(1) space, which should be possible since we’re dealing with lazy sequences - they don’t need to store previous or subsequent elements (except for exactly 2 elements for windowed
).