Explanations
phrases related to location or direction
conjunctions and linkers that connect ideas or clauses
Negative Logits
ore          -0.69
enance       -0.68
Doodle       -0.67
argo         -0.65
rite         -0.61
itar         -0.61
Monteneg     -0.59
eer          -0.59
els          -0.59
decentral    -0.58

Positive Logits
namely        0.99
whether       0.80
sqor          0.79
viz           0.79
anwhile       0.75
sidx          0.73
excluding     0.72
\":           0.71
partName      0.70
were          0.69
Activations Density 0.785%
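
The page reports the density figure without defining it. As a minimal sketch, assuming density means the fraction of sampled tokens on which the feature's activation is above zero, it could be computed as below. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration; they do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all; the source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 sampled tokens, 8 of which activate the feature.
sample = [0.0] * 992 + [1.5] * 8
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.785% would mean the feature fires on roughly 8 of every 1000 sampled tokens.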