INDEX
Explanations
references to phenomena and their characteristics
Negative Logits
définiti    -0.43
Stored      -0.42
väg         -0.41
hjär        -0.40
üstung      -0.40
becauſe     -0.40
Reſ         -0.40
dug         -0.40
vertrou     -0.39
Eſ          -0.39
Positive Logits
phenomena   1.49
phenomenon  1.43
Phenomena   1.34
fenomeno    1.27
phenomen    1.16
Phenomen    1.16
fenómeno    1.16
phénomènes  1.13
fenô        1.10
phénomène   1.09
Activations Density: 0.015%