INDEX
Explanations
terms indicating structural or physical attributes
Negative Logits
Token        Logit
owell        -0.16
Yates        -0.15
Olson        -0.15
lessness     -0.14
oa           -0.14
iness        -0.14
kova         -0.14
597          -0.14
isy          -0.14
ductive      -0.14
Positive Logits
Token        Logit
-wise        0.43
wise         0.40
wise         0.39
ewise        0.36
wards        0.35
istically    0.35
atically     0.34
aneously     0.34
lessly       0.32
ensively     0.31
Activations Density: 0.232%