Explanations
numbers and symbols indicative of measurement or mathematical concepts
Negative Logits
Token       Logit
this        -0.73
this        -0.68
↵           -0.65
这个        -0.58
*           -0.57
..          -0.57
**          -0.56
那个        -0.56
Variante    -0.56
!!!         -0.54
Positive Logits
Token       Logit
ſelf        0.82
Portale     0.75
GIPHY       0.75
―――――       0.75
pleaſure    0.74
ſelves      0.73
་་          0.69
ſtate       0.68
Houſe       0.68
!")         0.68
Activations Density 0.302%