@Catatomik
Member

@Catatomik Catatomik commented Feb 24, 2022

Test BIBM with RAPTOR

To be discussed with @GhostOcter

@Catatomik Catatomik self-assigned this Feb 24, 2022
@Catatomik
Member Author

⚠️ Currently not viable: memory leak from the worker pool (8.5 GB of RAM after computing 900 of 4400 stops).

@Catatomik
Member Author

⚡ Optimizations

  • All-to-all, no constraint: 10 minutes / 1.8 GB
  • All-to-all, 5 km constraint: 2 minutes / 600 MB
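
One common way to apply a distance constraint like the 5 km one above is to cut off the shortest-path search as soon as a tentative distance exceeds the limit, so each source only explores its neighbourhood instead of the whole graph. The sketch below illustrates that idea only; it is not the PR's implementation, and `Graph`, `Edge` and `boundedDistances` are hypothetical names introduced for this example.

```typescript
// Hypothetical types for illustration: not taken from the PR.
type NodeId = number;
interface Edge {
  to: NodeId;
  length: number; // metres
}
type Graph = Map<NodeId, Edge[]>;

/**
 * Dijkstra from `source` that never relaxes an edge leading beyond `maxDist`.
 * The frontier is a plain array scanned linearly for brevity;
 * a binary heap would be the usual choice on a large graph.
 */
function boundedDistances(graph: Graph, source: NodeId, maxDist: number): Map<NodeId, number> {
  const dist = new Map<NodeId, number>([[source, 0]]);
  const settled = new Set<NodeId>();
  const frontier: NodeId[] = [source];

  while (frontier.length > 0) {
    // Pick the frontier node with the smallest tentative distance
    let bestIdx = 0;
    for (let i = 1; i < frontier.length; i++)
      if (dist.get(frontier[i])! < dist.get(frontier[bestIdx])!) bestIdx = i;
    const node = frontier.splice(bestIdx, 1)[0];
    if (settled.has(node)) continue; // stale duplicate entry
    settled.add(node);

    const d = dist.get(node)!;
    for (const { to, length } of graph.get(node) ?? []) {
      const candidate = d + length;
      if (candidate > maxDist) continue; // the distance constraint
      if (candidate < (dist.get(to) ?? Infinity)) {
        dist.set(to, candidate);
        frontier.push(to);
      }
    }
  }

  return dist;
}
```

Running this once per stop gives an all-to-all result in which pairs farther apart than `maxDist` are simply absent, which would also explain a smaller memory footprint for the stored paths.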

@Catatomik
Member Author

Catatomik commented Feb 25, 2023

Current benchmarks

Steps measured:

  1. Query the database
  2. Make the foot graph
  3. Compute approached stops
  4. Refresh the graph by inserting approached stops
  5. Compute paths from all stops to all stops
  6. Update the database (insert graph & paths)
  • ❌ Compute & write GeoJSON files ➡️ needs more RAM (requires storing more data on edges/nodes)

Domain restricted to (44.813926, -0.581271) to (44.793123, -0.632578)

  • Without path computation: 00:04:669, run 10 times, debug mode
  • With path computation: 00:04:210, run 10 times, debug mode

No domain restriction

  • Without path computation
    • With 5 km restriction: 02:52:649, run 1 time, 1.5 GB peak, 300 MB at end
    • Without distance restriction: ?
  • With path computation
    • With 5 km restriction: 04:02:903, run 1 time, 1.6 GB peak, 300 MB at end
    • Without distance restriction: ?

@Catatomik
Member Author

Actions will now fail because pnpm install cannot succeed: it would require building bibm, which is currently impossible in GitHub CI (see Cata-Dev/Best-itinerary-Bordeaux-Metropole#1358).

Comment on lines +56 to +264
async function queryData(fpReqLen: number) {
  console.debug("Querying data...");
  const sourceDB = await initDB("bibm");
  const computeDB = await initDB("bibm-compute");
  const TBMStopsModel = TBMStopsModelInit(sourceDB);
  const SNCFStopsModel = SNCFStopsModelInit(sourceDB);
  const TBMSchedulesModel = TBMSchedulesModelInit(sourceDB)[1];
  const TBMScheduledRoutesModel = TBMScheduledRoutesModelInit(sourceDB);
  const SNCFScheduledRoutesModel = SNCFScheduledRoutesModelInit(sourceDB);
  const NonScheduledRoutesModel = NonScheduledRoutesModelInit(sourceDB);

  const resultModel = ResultModelInit(computeDB);

  /** DB Types */

  // Stops
  // TBM
  const dbTBMStopProjection = { _id: 1 } satisfies Partial<Record<keyof dbTBM_Stops, 1>>;
  type TBMStop = Pick<dbTBM_Stops, keyof typeof dbTBMStopProjection>;

  // SNCF
  const dbSNCFStopProjection = { _id: 1 } satisfies Partial<Record<keyof dbSNCF_Stops, 1>>;
  type SNCFStop = Pick<dbSNCF_Stops, keyof typeof dbSNCFStopProjection>;

  // Schedules
  const schedulesProjection = { arr_int_hor: 1, dep_int_hor: 1 } satisfies Partial<Record<keyof Schedule, 1>>;
  type dbSchedule = Pick<Schedule, keyof typeof schedulesProjection>;

  // Scheduled Routes
  // TBM
  const dbTBMSchedulesProjection = { hor_theo: 1 } satisfies Partial<Record<keyof dbTBM_Schedules_rt, 1>>;
  const dbTBMScheduledRoutesProjection = { _id: 1, stops: 1, trips: 1 } satisfies Partial<Record<keyof dbTBM_ScheduledRoutes, 1>>;
  type dbTBMScheduledRoute = Pick<dbTBM_ScheduledRoutes, keyof typeof dbTBMScheduledRoutesProjection>;
  interface TBMScheduledRoutesOverwritten /* extends dbTBM_ScheduledRoutes */ {
    _id: UnpackRefType<dbTBMScheduledRoute["_id"]>;
    stops: UnpackRefType<dbTBMScheduledRoute["stops"]>;
    trips: {
      // Not a Document because of lean
      schedules: (Pick<dbTBM_Schedules_rt, keyof typeof dbTBMSchedulesProjection> & dbSchedule)[];
    }[];
  }
  type TBMScheduledRoute = Override<dbTBMScheduledRoute, TBMScheduledRoutesOverwritten>;

  // SNCF
  const dbSNCFSchedulesProjection = { baseArrival: 1, baseDeparture: 1 } satisfies Partial<Record<keyof dbSNCF_Schedules, 1>>;
  const dbSNCFScheduledRoutesProjection = { _id: 1, stops: 1, trips: 1 } satisfies Partial<Record<keyof dbSNCF_ScheduledRoutes, 1>>;
  type dbSNCFScheduledRoute = Pick<dbSNCF_ScheduledRoutes, keyof typeof dbSNCFScheduledRoutesProjection>;
  interface SNCFScheduledRoutesOverwritten /* extends dbSNCF_ScheduledRoutes */ {
    stops: UnpackRefType<dbSNCFScheduledRoute["stops"]>;
    trips: {
      // Not a Document because of lean
      schedules: (Pick<dbSNCF_Schedules, keyof typeof dbSNCFSchedulesProjection> & dbSchedule)[];
    }[];
  }
  type SNCFScheduledRoute = Omit<dbSNCFScheduledRoute, keyof SNCFScheduledRoutesOverwritten> & SNCFScheduledRoutesOverwritten;

  type ProviderRouteId = TBMScheduledRoute["_id"] | SNCFScheduledRoute["_id"];
  // eslint-disable-next-line @typescript-eslint/no-duplicate-type-constituents
  type ProviderStopId = TBMStop["_id"] | SNCFStop["_id"];

  // Non Schedules Routes
  const dbNonScheduledRoutesProjection = { from: 1, to: 1, distance: 1 } satisfies Partial<Record<keyof dbFootPaths, 1>>;
  type dbNonScheduledRoute = Pick<dbFootPaths, keyof typeof dbNonScheduledRoutesProjection>;

  // Virtual IDs (stops routes) management

  const stopIdsMappingF = new Map<`${Providers}-${ProviderStopId}`, number>();
  const stopIdsMappingB = new Map<number, [ProviderStopId, Providers]>();
  const TBMStopsCount = (await TBMStopsModel.estimatedDocumentCount()) * 1.5;
  const SNCFStopsCount = (await SNCFStopsModel.estimatedDocumentCount()) * 1.5;
  const stopIdsRanges = {
    [Providers.TBM]: [0, TBMStopsCount, -1],
    [Providers.SNCF]: [TBMStopsCount + 1, TBMStopsCount + 1 + SNCFStopsCount, -1],
  } satisfies Record<string, [number, number, number]>;
  const [mapStopId, unmapStopId] = makeMapId(stopIdsRanges, stopIdsMappingF, stopIdsMappingB);

  const routeIdsMappingF = new Map<`${Providers}-${ProviderRouteId}`, number>();
  const routeIdsMappingB = new Map<number, [ProviderRouteId, Providers]>();
  // Memoizing allows us to only remember backward mapping, forward mapping is stored inside memoize
  const TBMSRCount = (await TBMScheduledRoutesModel.estimatedDocumentCount()) * 1.5;
  const SNCFSRCount = (await SNCFScheduledRoutesModel.estimatedDocumentCount()) * 1.5;
  const routeIdsRanges = {
    [Providers.TBM]: [0, TBMSRCount, -1],
    [Providers.SNCF]: [TBMSRCount + 1, TBMSRCount + 1 + SNCFSRCount, -1],
  } satisfies Record<string, [number, number, number]>;
  const [mapRouteId, unmapRouteId] = makeMapId(routeIdsRanges, routeIdsMappingF, routeIdsMappingB);

  // Non scheduled routes

  // Query must associate (s, from) AND (from, s) forall s in stops !
  const dbNonScheduledRoutes = (
    (await NonScheduledRoutesModel.find<DocumentType<dbNonScheduledRoute>>(
      { distance: { $lte: fpReqLen } },
      { ...dbNonScheduledRoutesProjection, _id: 0 },
    )
      .lean()
      .exec()) as dbNonScheduledRoute[]
  ).reduce<Map<number, ConstructorParameters<typeof RAPTORData<unknown, number, number>>[1][number][2]>>((acc, { from, to, distance }) => {
    const mappedFrom = mapStopId(parseInt(from.substring(3).split("-")[0]), parseInt(from.split("-")[1]));
    const mappedTo = mapStopId(parseInt(to.substring(3).split("-")[0]), parseInt(to.split("-")[1]));

    for (const [from, to] of [
      [mappedFrom, mappedTo],
      [mappedTo, mappedFrom],
    ]) {
      let stopNonScheduledRoutes = acc.get(from);
      if (!stopNonScheduledRoutes) {
        stopNonScheduledRoutes = [];
        acc.set(from, stopNonScheduledRoutes);
      }

      stopNonScheduledRoutes.push({ length: distance, to });
    }

    return acc;
  }, new Map());

  // TBM stops & routes

  const dbTBMScheduledRoutes = (
    (await TBMScheduledRoutesModel.find<DocumentType<TBMScheduledRoute>>({}, dbTBMScheduledRoutesProjection)
      .populate("trips.schedules", { ...schedulesProjection, ...dbTBMSchedulesProjection, _id: 0, __t: 0 })
      .lean()
      .exec()) as TBMScheduledRoute[]
  ).map(({ _id, stops, trips }) => ({
    _id,
    stops: stops.map((stop) => mapStopId(Providers.TBM, stop)),
    trips,
  }));

  const TBMStops = dbTBMScheduledRoutes.reduce<Map<number, [number[], Exclude<ReturnType<(typeof dbNonScheduledRoutes)["get"]>, undefined>]>>(
    (acc, { _id: routeId, stops }) => {
      for (const stopId of stops) {
        let stop = acc.get(stopId);
        if (!stop) {
          stop = [[], dbNonScheduledRoutes.get(stopId) ?? []];
          acc.set(stopId, stop);
        }

        stop[0].push(mapRouteId(Providers.TBM, routeId));
      }

      return acc;
    },
    new Map(
      (
        (await TBMStopsModel.find<DocumentType<TBMStop>>({ coords: { $not: { $elemMatch: { $eq: Infinity } } } }, dbTBMStopProjection)
          .lean()
          .exec()) as TBMStop[]
      ).map(({ _id }) => {
        const mappedId = mapStopId(Providers.TBM, _id);

        return [mappedId, [[], dbNonScheduledRoutes.get(mappedId) ?? []]];
      }),
    ),
  );

  // SNCF stops & routes

  const dbSNCFScheduledRoutes = (
    (await SNCFScheduledRoutesModel.find<DocumentType<SNCFScheduledRoute>>({}, dbSNCFScheduledRoutesProjection)
      .populate("trips.schedules", { ...schedulesProjection, ...dbSNCFSchedulesProjection, _id: 0 })
      .lean()
      .exec()) as SNCFScheduledRoute[]
  ).map(({ _id, stops, trips }) => ({
    _id,
    stops: stops.map((stop) => mapStopId(Providers.SNCF, stop)),
    trips,
  }));

  const SNCFStops = dbSNCFScheduledRoutes.reduce<Map<number, [number[], Exclude<ReturnType<(typeof dbNonScheduledRoutes)["get"]>, undefined>]>>(
    (acc, { _id: routeId, stops }) => {
      for (const stopId of stops) {
        let stop = acc.get(stopId);
        if (!stop) {
          stop = [[], dbNonScheduledRoutes.get(stopId) ?? []];
          acc.set(stopId, stop);
        }

        stop[0].push(mapRouteId(Providers.SNCF, routeId));
      }

      return acc;
    },
    new Map(
      (
        (await SNCFStopsModel.find<DocumentType<SNCFStop>>({ coords: { $not: { $elemMatch: { $eq: Infinity } } } }, dbSNCFStopProjection)
          .lean()
          .exec()) as SNCFStop[]
      ).map(({ _id }) => {
        const mappedId = mapStopId(Providers.SNCF, _id);

        return [mappedId, [[], dbNonScheduledRoutes.get(mappedId) ?? []]];
      }),
    ),
  );
  return {
    TBMStops,
    SNCFStops,
    dbTBMScheduledRoutes,
    dbSNCFScheduledRoutes,
    dbNonScheduledRoutes,
    TBMSchedulesModel,
    resultModel,
    mapStopId,
    unmapStopId,
    mapRouteId,
    unmapRouteId,
  };

Function with many returns (count = 6): queryData [qlty:return-statements]

Comment on lines +186 to +210
  const TBMStops = dbTBMScheduledRoutes.reduce<Map<number, [number[], Exclude<ReturnType<(typeof dbNonScheduledRoutes)["get"]>, undefined>]>>(
    (acc, { _id: routeId, stops }) => {
      for (const stopId of stops) {
        let stop = acc.get(stopId);
        if (!stop) {
          stop = [[], dbNonScheduledRoutes.get(stopId) ?? []];
          acc.set(stopId, stop);
        }

        stop[0].push(mapRouteId(Providers.TBM, routeId));
      }

      return acc;
    },
    new Map(
      (
        (await TBMStopsModel.find<DocumentType<TBMStop>>({ coords: { $not: { $elemMatch: { $eq: Infinity } } } }, dbTBMStopProjection)
          .lean()
          .exec()) as TBMStop[]
      ).map(({ _id }) => {
        const mappedId = mapStopId(Providers.TBM, _id);

        return [mappedId, [[], dbNonScheduledRoutes.get(mappedId) ?? []]];
      }),
    ),

Found 26 lines of similar code in 2 locations (mass = 172) [qlty:similar-code]

Comment on lines +226 to +250
  const SNCFStops = dbSNCFScheduledRoutes.reduce<Map<number, [number[], Exclude<ReturnType<(typeof dbNonScheduledRoutes)["get"]>, undefined>]>>(
    (acc, { _id: routeId, stops }) => {
      for (const stopId of stops) {
        let stop = acc.get(stopId);
        if (!stop) {
          stop = [[], dbNonScheduledRoutes.get(stopId) ?? []];
          acc.set(stopId, stop);
        }

        stop[0].push(mapRouteId(Providers.SNCF, routeId));
      }

      return acc;
    },
    new Map(
      (
        (await SNCFStopsModel.find<DocumentType<SNCFStop>>({ coords: { $not: { $elemMatch: { $eq: Infinity } } } }, dbSNCFStopProjection)
          .lean()
          .exec()) as SNCFStop[]
      ).map(({ _id }) => {
        const mappedId = mapStopId(Providers.SNCF, _id);

        return [mappedId, [[], dbNonScheduledRoutes.get(mappedId) ?? []]];
      }),
    ),

Found 26 lines of similar code in 2 locations (mass = 172) [qlty:similar-code]
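
One possible way to address the duplication flagged here is to move the shared logic into a helper that is parameterised by provider. The sketch below is only an illustration under simplified assumptions: `buildStopsMap`, `FootPath`, `MappedRoute` and `StopEntry` are hypothetical names, and the database queries are assumed to have already been run by the caller.

```typescript
// Hypothetical shapes, simplified from the two flagged snippets.
interface FootPath {
  to: number;
  length: number;
}
interface MappedRoute<RouteId> {
  _id: RouteId;
  stops: number[]; // already-mapped stop IDs
}
type StopEntry = [routes: number[], footPaths: FootPath[]];

/**
 * Builds the stop map for one provider: every known stop gets an entry,
 * then each route is attached to every stop it serves.
 */
function buildStopsMap<RouteId>(
  knownStopIds: Iterable<number>,
  routes: Iterable<MappedRoute<RouteId>>,
  footPaths: ReadonlyMap<number, FootPath[]>,
  mapRoute: (routeId: RouteId) => number,
): Map<number, StopEntry> {
  const stops = new Map<number, StopEntry>();

  // Get the entry of a stop, creating it on first access
  const entryFor = (stopId: number): StopEntry => {
    let entry = stops.get(stopId);
    if (!entry) {
      entry = [[], footPaths.get(stopId) ?? []];
      stops.set(stopId, entry);
    }
    return entry;
  };

  for (const stopId of knownStopIds) entryFor(stopId);
  for (const { _id, stops: routeStops } of routes)
    for (const stopId of routeStops) entryFor(stopId)[0].push(mapRoute(_id));

  return stops;
}
```

With such a helper, the TBM and SNCF blocks would differ only in their arguments, for instance passing `(id) => mapRouteId(Providers.TBM, id)` or `(id) => mapRouteId(Providers.SNCF, id)` as `mapRoute`.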

Comment on lines +419 to +428
async function insertResults<TimeVal extends Timestamp | InternalTimeInt, V, CA extends [V, string][]>(
  resultModel: Awaited<ReturnType<typeof queryData>>["resultModel"],
  unmapStopId: Awaited<ReturnType<typeof queryData>>["unmapStopId"],
  unmapRouteId: Awaited<ReturnType<typeof queryData>>["unmapRouteId"],
  timeType: Time<TimeVal>,
  from: [mappedId: number, LocationAddress | LocationTBM],
  to: [mappedId: number, LocationAddress | LocationTBM],
  departureTime: TimeVal,
  settings: RAPTORRunSettings,
  results: ReturnType<BaseRAPTOR<TimeVal, SharedID, number, V, CA>["getBestJourneys"]>,

Function with many parameters (count = 9): insertResults [qlty:function-parameters]

Comment on lines +34 to +48
async function lintScheduledRoutes({ dbScheduledRoutes, TBMScheduledRoutesModel, TBMLinesRoutesModel }: Awaited<ReturnType<typeof queryData>>) {
  // Lint schedules routes
  console.debug("Linting...");

  for await (const scheduledRoute of dbScheduledRoutes)
    for (const [tripIndex, trip] of scheduledRoute.trips.entries())
      for (const [i, schedule] of trip.schedules.entries())
        if (typeof schedule !== "object" || (schedule.rs_sv_arret_p !== Infinity && schedule.rs_sv_arret_p !== scheduledRoute.stops[i])) {
          const scheduledRoutePop = await TBMScheduledRoutesModel.populate(scheduledRoute, { path: "_id", options: { lean: true } });
          scheduledRoutePop._id = await TBMLinesRoutesModel.populate(scheduledRoutePop._id, { path: "rs_sv_ligne_a", options: { lean: true } });

          console.log(
            `Route ${typeof scheduledRoutePop._id === "object" ? `${typeof scheduledRoutePop._id.rs_sv_ligne_a === "object" ? scheduledRoutePop._id.rs_sv_ligne_a.libelle : scheduledRoutePop._id.rs_sv_ligne_a} ${scheduledRoutePop._id.libelle} (${scheduledRoutePop._id._id})` : scheduledRoutePop._id}, trip idx ${tripIndex} (${trip.tripId}): at idx ${i}, schedule's stop ${typeof schedule === "object" ? `${schedule.rs_sv_arret_p as number} / ${schedule.hor_theo.toLocaleString()}` : `${schedule} [NOT POPULATED]`} !== stop ${scheduledRoute.stops[i]}`,
          );
        }

Function with high complexity (count = 28): lintScheduledRoutes [qlty:function-complexity]

          scheduledRoutePop._id = await TBMLinesRoutesModel.populate(scheduledRoutePop._id, { path: "rs_sv_ligne_a", options: { lean: true } });

          console.log(
            `Route ${typeof scheduledRoutePop._id === "object" ? `${typeof scheduledRoutePop._id.rs_sv_ligne_a === "object" ? scheduledRoutePop._id.rs_sv_ligne_a.libelle : scheduledRoutePop._id.rs_sv_ligne_a} ${scheduledRoutePop._id.libelle} (${scheduledRoutePop._id._id})` : scheduledRoutePop._id}, trip idx ${tripIndex} (${trip.tripId}): at idx ${i}, schedule's stop ${typeof schedule === "object" ? `${schedule.rs_sv_arret_p as number} / ${schedule.hor_theo.toLocaleString()}` : `${schedule} [NOT POPULATED]`} !== stop ${scheduledRoute.stops[i]}`,

Found 2 issues:

1. Deeply nested control flow (level = 5) [qlty:nested-control-flow]
2. Deeply nested control flow (level = 5) [qlty:nested-control-flow]
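
A possible way to lower both the nesting and the complexity reported for `lintScheduledRoutes` is to separate detection from reporting: a generator walks routes, trips and schedules and yields only the offending entries, while the caller handles population and logging in a flat loop. The sketch below assumes simplified, hypothetical document shapes (`Schedule`, `Trip`, `ScheduledRoute`) and hypothetical helper names (`isMismatch`, `findMismatches`); it is not the PR's code.

```typescript
// Hypothetical, simplified shapes: the real documents carry more fields.
interface Schedule {
  rs_sv_arret_p: number;
}
interface Trip {
  tripId: number;
  schedules: (Schedule | number)[]; // a bare number stands for a non-populated reference
}
interface ScheduledRoute {
  _id: number;
  stops: number[];
  trips: Trip[];
}
interface Mismatch {
  route: ScheduledRoute;
  tripIndex: number;
  trip: Trip;
  scheduleIndex: number;
  schedule: Schedule | number;
}

/** True when the schedule is not populated or points at a stop other than the expected one. */
function isMismatch(schedule: Schedule | number, expectedStop: number | undefined): boolean {
  if (typeof schedule !== "object") return true;
  return schedule.rs_sv_arret_p !== Infinity && schedule.rs_sv_arret_p !== expectedStop;
}

/** Walks routes, trips and schedules, yielding only the entries that fail the check. */
function* findMismatches(routes: Iterable<ScheduledRoute>): Generator<Mismatch> {
  for (const route of routes)
    for (const [tripIndex, trip] of route.trips.entries())
      for (const [scheduleIndex, schedule] of trip.schedules.entries())
        if (isMismatch(schedule, route.stops[scheduleIndex])) yield { route, tripIndex, trip, scheduleIndex, schedule };
}
```

The reporting side then becomes a single `for (const mismatch of findMismatches(routes))` loop, and the long template literal could be built by a dedicated formatting function.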

Comment on lines +10 to +16
export async function benchmark<F extends (...args: any[]) => any>(
  this: unknown,
  f: F,
  args: Parameters<F>,
  thisArg: unknown = this, // TODO : deeper look on thisArg + ThisType
  times = 1,
  logStats = true,

Function with many parameters (count = 7): benchmark [qlty:function-parameters]
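
A usual remedy for this kind of warning is to fold the optional trailing parameters into a single options object with defaults. The sketch below shows the pattern on a reduced timing helper; `benchmarkWithOptions`, `BenchmarkOptions` and `BenchmarkResult` are hypothetical names, only `times`, `logStats` and `thisArg` mirror parameters visible in the excerpt above, and the remaining parameters of the real function are not shown there, so they are left out.

```typescript
// Hypothetical option bag; defaults mirror the ones visible in the excerpt.
interface BenchmarkOptions {
  times?: number;
  logStats?: boolean;
  thisArg?: unknown;
}

interface BenchmarkResult<R> {
  lastReturn: Awaited<R> | undefined;
  totalMs: number;
  meanMs: number;
}

async function benchmarkWithOptions<A extends unknown[], R>(
  f: (...args: A) => R | Promise<R>,
  args: A,
  { times = 1, logStats = true, thisArg }: BenchmarkOptions = {},
): Promise<BenchmarkResult<R>> {
  let lastReturn: Awaited<R> | undefined;

  // Wall-clock timing with millisecond resolution, enough for multi-second runs
  const start = Date.now();
  for (let i = 0; i < times; i++) lastReturn = await f.apply(thisArg, args);
  const totalMs = Date.now() - start;

  const meanMs = times > 0 ? totalMs / times : 0;
  if (logStats) console.log(`${f.name || "anonymous"}: ${times} run(s), mean ${meanMs.toFixed(2)} ms`);

  return { lastReturn, totalMs, meanMs };
}
```

A call site then names only what it changes, for example `benchmarkWithOptions(queryData, [3000], { times: 10 })`, where the argument value is purely illustrative.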
