comparison test-data/sample_text_frequency.dat @ 0:e991d4e60c17 draft

planemo upload commit 0203cb3a0b40d9348674b2b098af805e2986abca-dirty
author stevecassidy
date Wed, 12 Oct 2016 22:17:53 -0400
parents
children
-1:000000000000 0:e991d4e60c17
1 Word Count Percent
2 the 44 6.32
3 of 26 3.74
4 and 25 3.59
5 . 24 3.45
6 to 23 3.30
7 a 15 2.16
8 , 12 1.72
9 for 12 1.72
10 will 12 1.72
11 is 11 1.58
12 DADA 9 1.29
13 some 8 1.15
14 ( 7 1.01
15 be 7 1.01
16 on 7 1.01
17 that 7 1.01
18 this 7 1.01
19 Australian 7 1.01
20 ) 7 1.01
21 The 7 1.01
22 text 6 0.86
23 project 6 0.86
24 we 6 0.86
25 infrastructure 6 0.86
26 from 6 0.86
27 have 6 0.86
28 in 6 0.86
29 video 5 0.72
30 language 5 0.72
31 data 5 0.72
32 it 5 0.72
33 collection 5 0.72
34 annotation 5 0.72
35 Corpus 4 0.57
36 with 4 0.57
37 build 4 0.57
38 audio 4 0.57
39 hope 3 0.43
40 collections 3 0.43
41 resources 3 0.43
42 funding 3 0.43
43 available 3 0.43
44 English 3 0.43
45 meta-data 3 0.43
46 Macquarie 3 0.43
47 done 3 0.43
48 two 3 0.43
49 corpus 3 0.43
50 part 3 0.43
51 work 3 0.43
52 up 3 0.43
53 at 3 0.43
54 - 3 0.43
55 code 2 0.29
56 people 2 0.29
57 We 2 0.29
58 but 2 0.29
59 has 2 0.29
60 them 2 0.29
61 example 2 0.29
62 words 2 0.29
63 using 2 0.29
64 now 2 0.29
65 collect 2 0.29
66 each 2 0.29
67 corpora 2 0.29
68 year 2 0.29
69 server 2 0.29
70 new 2 0.29
71 public 2 0.29
72 by 2 0.29
73 search 2 0.29
74 store 2 0.29
75 involves 2 0.29
76 within 2 0.29
77 texts 2 0.29
78 support 2 0.29
79 Language 2 0.29
80 sentences 2 0.29
81 freely 2 0.29
82 National 2 0.29
83 funded 2 0.29
84 site 2 0.29
85 an 2 0.29
86 as 2 0.29
87 able 2 0.29
88 make 2 0.29
89 subjects 2 0.29
90 speech 2 0.29
91 development 2 0.29
92 recording 2 0.29
93 I 2 0.29
94 significant 2 0.29
95 task 2 0.29
96 provide 2 0.29
97 ARC 2 0.29
98 demo 1 0.14
99 automatically 1 0.14
100 What 1 0.14
101 Service 1 0.14
102 being 1 0.14
103 both 1 0.14
104 soon 1 0.14
105 existing 1 0.14
106 large 1 0.14
107 via 1 0.14
108 looks 1 0.14
109 Haugh 1 0.14
110 still 1 0.14
111 find 1 0.14
112 alignment 1 0.14
113 web 1 0.14
114 Recently 1 0.14
115 writing 1 0.14
116 linguistics 1 0.14
117 only 1 0.14
118 going 1 0.14
119 systems 1 0.14
120 under 1 0.14
121 Using 1 0.14
122 2011 1 0.14
123 take 1 0.14
124 move 1 0.14
125 around 1 0.14
126 get 1 0.14
127 read 1 0.14
128 providing 1 0.14
129 Michael 1 0.14
130 number 1 0.14
131 Project 1 0.14
132 next 1 0.14
133 While 1 0.14
134 Oz 1 0.14
135 communities 1 0.14
136 comes 1 0.14
137 projects 1 0.14
138 articles 1 0.14
139 like 1 0.14
140 visible 1 0.14
141 manual 1 0.14
142 solution 1 0.14
143 've 1 0.14
144 capability 1 0.14
145 these 1 0.14
146 continue 1 0.14
147 steps 1 0.14
148 common 1 0.14
149 small 1 0.14
150 Speech 1 0.14
151 fixed 1 0.14
152 Griffith 1 0.14
153 searching 1 0.14
154 core 1 0.14
155 doing 1 0.14
156 Since 1 0.14
157 idea 1 0.14
158 All 1 0.14
159 titles 1 0.14
160 are 1 0.14
161 picked 1 0.14
162 Some 1 0.14
163 network 1 0.14
164 renamed 1 0.14
165 managing 1 0.14
166 sites 1 0.14
167 publish 1 0.14
168 research 1 0.14
169 Later 1 0.14
170 AusNC 1 0.14
171 written 1 0.14
172 between 1 0.14
173 technology 1 0.14
174 reading 1 0.14
175 can 1 0.14
176 recently 1 0.14
177 repository 1 0.14
178 partners 1 0.14
179 This 1 0.14
180 University 1 0.14
181 hosted 1 0.14
182 free 1 0.14
183 box 1 0.14
184 exposing 1 0.14
185 technical 1 0.14
186 study 1 0.14
187 allows 1 0.14
188 forced 1 0.14
189 Sign 1 0.14
190 published 1 0.14
191 map 1 0.14
192 MQ 1 0.14
193 month 1 0.14
194 interviews 1 0.14
195 software 1 0.14
196 already 1 0.14
197 useful 1 0.14
198 secure 1 0.14
199 'black 1 0.14
200 primary 1 0.14
201 whatever 1 0.14
202 Update 1 0.14
203 1000 1 0.14
204 parties 1 0.14
205 loaded 1 0.14
206 centralised 1 0.14
207 Auslan 1 0.14
208 1900 1 0.14
209 size 1 0.14
210 little 1 0.14
211 Australia 1 0.14
212 initial 1 0.14
213 been 1 0.14
214 Early 1 0.14
215 their 1 0.14
216 station 1 0.14
217 down 1 0.14
218 basic 1 0.14
219 collected 1 0.14
220 : 1 0.14
221 Data 1 0.14
222 ANDS 1 0.14
223 more 1 0.14
224 describe 1 0.14
225 HCSNet 1 0.14
226 denoting 1 0.14
227 interviewed 1 0.14
228 Trevor 1 0.14
229 bitbucket 1 0.14
230 testing 1 0.14
231 Johnston 1 0.14
232 effort 1 0.14
233 pilot 1 0.14
234 upgrades 1 0.14
235 main 1 0.14
236 look 1 0.14
237 developing 1 0.14
238 reliable 1 0.14
239 pace 1 0.14
240 while 1 0.14
241 technoogy 1 0.14
242 install 1 0.14
243 Our 1 0.14
244 transcripts 1 0.14
245 country 1 0.14
246 descriptions 1 0.14
247 due 1 0.14
248 documentation 1 0.14
249 allowed 1 0.14
250 sample 1 0.14
251 enable 1 0.14
252 create 1 0.14
253 demonstration 1 0.14
254 Map 1 0.14
255 speakers 1 0.14
256 inside 1 0.14
257 end 1 0.14
258 sessions 1 0.14
259 things 1 0.14
260 permission 1 0.14
261 feature 1 0.14
262 who 1 0.14
263 started 1 0.14
264 which 1 0.14
265 digital 1 0.14
266 many 1 0.14
267 outside 1 0.14
268 used 1 0.14
269 's 1 0.14
270 separate 1 0.14
271 collaboration 1 0.14
272 after 1 0.14
273 driver 1 0.14
274 needs 1 0.14
275 moment 1 0.14
276 important 1 0.14
277 designed 1 0.14
278 tidying 1 0.14
279 services 1 0.14
280 elicit 1 0.14
281 AusTalk 1 0.14
282 expand 1 0.14
283 stereo 1 0.14
284 natural 1 0.14
285 ' 1 0.14
286 third 1 0.14
287 later 1 0.14
288 game 1 0.14
289 An 1 0.14
290 As 1 0.14
291 so 1 0.14
292 Big 1 0.14
293 allow 1 0.14
294 sets 1 0.14