The issue was to find the longest common substring.
What this means is: Given the word "abracadabra" you are looking for the longest repeated set of characters.
The longest repeated sub-string is "abra" which appears at positions 0 and 7 (counting from 0).
The solution the person presented involved putting the strings from smallest to largest into a dictionary (hash table) with counts of iterations.
a-4
b-2
ab-2
And then walking the tree to find the largest result. This solution seems odd to me.
My solution is to make a dynamic array that can be queried for contents. Further, doing it from longest to shortest:
Edit
End Edit
Solution:
Get longest string: abracadabra
Is string in array? YES? got answer and bail; No --> add to array
*Reduce look string by 1 character
Move to start of target string:
**Get look string: abracadabr
Is string in array? YES? got answer and bail; No --> add to array
Is look string at end of target string? Yes --> loop back to *; No --> move forward one character and loop back to **
This is massively not optimised for time or processing. But it's a first stab.
Also, just pseudo code... loop counters would be my next pass.
Comments? Improvements?
Oooh, good point.
Re: Oooh, good point.
Re: Oooh, good point.
Maybe I don't understand, but ...
Re: Maybe I don't understand, but ...
So, going back to the example of "abracadabra"
We can't look for duplications of "abraca" which is 6 of 10 characters long because we'd need to move to the next position to compare it. However, there are only 4 characters left, so we won't find that string again.
As a result, we start from the largest duplicatable string, which in 10 characters would be 5.
Now theoretically if you had a string like "AAAAAA" and looked for "AAAAA" that does exist twice. So I'd have to throw that back to
Edited at 2009-03-11 03:19 pm (UTC)
Re: Maybe I don't understand, but ...
Re: Maybe I don't understand, but ...
But as pointed out later, overlap means that my contribution is shot.