Friday, 20 February 2015

How to Use Set.toArray() for sorting Strings?



I've found many answers about converting a Set to an ArrayList, but none of them really solves my problem. I have a program that reads from file.txt and determines how similar the sentences are to each other using a Jaccard similarity matrix. The contents of file.txt are as follows:



The cat in the hat

The cat sat on the mat

Pigs in a blanket



I then have a for loop that pairs each char on each line with the adjacent char and puts them in a HashSet to ensure uniqueness. Like so:



[ c, in, h, i, t , n , at, Th, t, th, ha, e , he, ca]

[ c, t , m, sa, o, n , at, s, Th, t, th, ma, e , he, ca, on]

[ a, b, in, i, bl, gs, s , an, et, n , la, Pi, ke, nk, ig, a ]



My problem now is getting the pairs of chars out of the Set and into an ArrayList so they can be sorted. Each pair from one line then has to be compared with the pairs from another line using String.equals() for the Jaccard formula: J = number of matches / number of unique pairs. I have a single Set that is recycled after it's filled with the first line.



HashSet<String> shingleTrimSet = new HashSet<String>();
List<String> shingleArrayList = new ArrayList<String>();

System.out.println("\nSorted Shingles:");

for(int i = 0; i < lineCount; i++){
    shingleTrimSet.clear();

    for(int idx = 0, jdx = 1; idx+1 < lines[i].length(); idx++, jdx++){
        shingleTrimSet.add( lines[i].substring( idx, jdx+1 ) );
    }
    shingleTrimSet.toArray( new String[shingleTrimSet.size()] );

}


shingleTrimSet.toArray( new String[shingleTrimSet.size()] ) works in this scenario, but I don't know how to use the result for anything later. How do I know which ArrayList the Set's contents have been placed into? It seems not to have a variable name.
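
For what it's worth, toArray() does not store its result anywhere: it returns a new String[] (an array, not an ArrayList), and the value is lost unless it is assigned to a variable. Below is a minimal sketch of one way this could look, not a definitive fix. The class name ShingleSorter and the helper methods shingles, sortedArray, sortedList and jaccard are hypothetical names made up for illustration, and the shingles are left untrimmed as in the loop above. It also assumes one Set per line is kept rather than a single recycled Set, so that lines can still be compared afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; all names here are illustrative.
public class ShingleSorter {

    // Collect the unique two-character shingles of one line (no trimming).
    public static Set<String> shingles(String line) {
        Set<String> result = new HashSet<>();
        for (int idx = 0; idx + 1 < line.length(); idx++) {
            result.add(line.substring(idx, idx + 2));
        }
        return result;
    }

    // toArray() RETURNS the array; assigning it to a variable keeps it.
    public static String[] sortedArray(Set<String> set) {
        String[] array = set.toArray(new String[0]);
        Arrays.sort(array);
        return array;
    }

    // Alternative: copy the Set straight into an ArrayList and sort that.
    public static List<String> sortedList(Set<String> set) {
        List<String> list = new ArrayList<>(set);
        Collections.sort(list);
        return list;
    }

    // Jaccard similarity: size of intersection / size of union.
    // Returns 0.0 for two empty sets (an arbitrary choice for this sketch).
    public static double jaccard(Set<String> a, Set<String> b) {
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        String[] lines = {
            "The cat in the hat",
            "The cat sat on the mat",
            "Pigs in a blanket"
        };

        // Keep one Set per line so later comparisons are still possible.
        List<Set<String>> perLine = new ArrayList<>();
        System.out.println("Sorted Shingles:");
        for (String line : lines) {
            Set<String> set = shingles(line);
            perLine.add(set);
            System.out.println(sortedList(set));
        }

        // Compare the first two lines.
        System.out.println(jaccard(perLine.get(0), perLine.get(1)));
    }
}
```

Note that String's natural ordering compares UTF-16 code units, so uppercase pairs such as "Th" sort before lowercase ones. Since the shingles are already in Sets, the Jaccard value can be computed with retainAll() and addAll() directly; sorting is then only needed for display.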



