INDEX
Explanations
acronyms or coded language referring to websites, ships, or scientific study
space, ship
Negative Logits
ship         -1.92
s            -1.66
school       -1.66
space        -1.63
shop         -1.62
systems      -1.59
solutions    -1.55
sing         -1.55
soft         -1.55
ski          -1.54
Positive Logits
s            0.84
S            0.72
Sj           0.63
س            0.61
ss           0.60
S            0.59
S            0.58
Sot          0.57
si           0.57
sb           0.57
Activations Density: 4.020%