
Init: KDTree #3

Open
Engelsgeduld wants to merge 2 commits into main from KD-Tree
Conversation

@Engelsgeduld (Owner)

No description provided.

@Engelsgeduld changed the title Kd tree → Init: KDTree on Mar 11, 2025

@maybenotilya left a comment


And where are the notebooks...

self.id = 1
self.size = size
self.heap: list[tuple[float, int, Optional[PointType]]] = [(-np.inf, 1, None)]
heapq.heapify(self.heap)


Why call heapify on a heap that has only one element?
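To illustrate the point (a minimal sketch, not the PR's code): any single-element list already satisfies the heap invariant, so heapify on it does nothing, and heappush keeps the invariant on its own.

```python
import heapq
import math

# A one-element list is already a valid heap; heapify would be a no-op here.
heap = [(-math.inf, 1, None)]

# Pushing through heapq maintains the invariant without ever calling heapify.
heapq.heappush(heap, (0.5, 2, None))
assert heap[0] == (-math.inf, 1, None)  # smallest priority stays at the root
```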


class Heap:
    def __init__(self, size: int):
        self.id = 1


I don't quite understand the purpose of id

class Heap:
    def __init__(self, size: int):
        self.id = 1
        self.size = size


Well, this is really a capacity

raise ValueError("Leaf size must be positive")

valid_points = self._validate_points(points)
self.dim: int = valid_points.shape[1]


For convenience it would probably be better to declare this as None in the constructor



class AbstractScaler(metaclass=ABCMeta):
    def fit(self, data: np._typing.NDArray) -> None:


This should be wrapped in the @abstractmethod decorator
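A minimal sketch of the suggestion: with @abstractmethod, forgetting to override fit becomes an instantiation-time TypeError instead of a silent no-op. The ConcreteScaler subclass below is hypothetical, just to show the pattern.

```python
from abc import ABCMeta, abstractmethod

import numpy as np


class AbstractScaler(metaclass=ABCMeta):
    @abstractmethod
    def fit(self, data: np.ndarray) -> None:
        """Subclasses are now forced to implement fit."""


class ConcreteScaler(AbstractScaler):  # hypothetical subclass for illustration
    def fit(self, data: np.ndarray) -> None:
        self.mean = data.mean(axis=0)


# AbstractScaler() now raises TypeError; ConcreteScaler() instantiates fine.
```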

raise ValueError("Features and targets must be same length")
self.model = KDTree(features, self.leaf_size, self.metric)
self.classifier = dict((tuple(pair[0]), pair[1]) for pair in zip(features.tolist(), targets.tolist()))
self.targets = targets


I don't see why the targets are stored separately when they are already in self.classifier

Comment on lines +30 to +36
if point in self.classifier:
    probability.append(
        (
            np.unique(self.targets),
            (self.classifier[point] == np.unique(self.targets)).astype(int),
        )
    )


I barely understand what is going on here, but it looks incorrect, at the very least because there can be two identical points with different labels
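To make the concern concrete, a small sketch with made-up data: the dict built from zip in the PR silently keeps only the last label seen for a duplicated point.

```python
import numpy as np

# Hypothetical data: the same point appears twice with different labels.
features = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
targets = np.array([0, 1, 1])

# Same construction as in the PR: later entries overwrite earlier ones,
# so the label 0 for the first copy of (0.0, 0.0) is lost.
classifier = dict(
    (tuple(pair[0]), pair[1]) for pair in zip(features.tolist(), targets.tolist())
)
assert classifier[(0.0, 0.0)] == 1  # only the last label survives
```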

        )
    )
else:
    result = self.model.query([point], self.k)


Your model.query accepts many points at once, so why not pass all of them in one call?
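A hedged sketch of the suggested batching, assuming model.query(points, k) accepts a batch and returns one neighbour array per query point, as the PR's per-point calls imply. predict_proba is written as a free function here purely for illustration.

```python
import numpy as np


def predict_proba(self, points: np.ndarray) -> list:
    # One batched call instead of one model.query per point.
    results = self.model.query(points, self.k)
    classes = np.unique(self.targets)
    probability = []
    for neighbors in results:
        # Look up each neighbour's label, then count votes per class.
        labels = np.array([self.classifier[tuple(n.tolist())] for n in neighbors])
        counts = np.array([(labels == c).sum() for c in classes])
        probability.append((classes, counts / len(neighbors)))
    return probability
```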

result = self.model.query([point], self.k)
target_result = np.array([self.classifier[tuple(neighbors.tolist())] for neighbors in result[0]])
counts = np.array([(target_result == val).sum() for val in np.unique(self.targets)])
probability.append((self.targets, counts / len(result[0])))


Storing the targets on every predict seems like a memory overhead; they are only ever used inside the class anyway


Also, why is it np.unique(self.targets) above but plain self.targets here?

def transform(self, data: np._typing.NDArray) -> np._typing.NDArray:
    if self.median is None or self.iqr is None:
        raise ValueError("Scaler unfitted")
    return (data - self.median) / self.iqr


self.iqr can be equal to zero
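One possible guard, sketched here as an assumption about the intended fix rather than the PR's code: a constant feature has IQR equal to zero, so the division would produce inf/nan. A common workaround is to substitute 1 for zero IQRs, so the feature is only centred, not scaled.

```python
import numpy as np

# Column 0 is constant, so its IQR is 0 and a plain division would give nan/inf.
data = np.array([[1.0, 5.0], [1.0, 7.0], [1.0, 9.0]])
q75, q25 = np.percentile(data, [75, 25], axis=0)
iqr = q75 - q25
safe_iqr = np.where(iqr == 0, 1.0, iqr)  # fall back to 1 for constant features
median = np.median(data, axis=0)
scaled = (data - median) / safe_iqr
# The constant column maps to all zeros with no division warnings.
```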


2 participants