Asked  7 Months ago    Answers:  5   Viewed   33 times

I'm working on a Node.js project that contains sub-projects. Each sub-project will have its own MongoDB database, and Mongoose will be used for wrapping and querying the databases. But the problem is:

  • Mongoose doesn't allow you to use multiple databases in a single Mongoose instance, as the models are built on one connection.
  • Using multiple Mongoose instances isn't an option either, because Node.js caches modules in require() and always returns the same instance. I know I can disable module caching in Node.js, but I don't think that is a good solution, as it would only be needed for Mongoose.

    I've tried using createConnection() and openSet() in Mongoose, but they didn't solve the problem.

    I've tried deep-copying the Mongoose instance (http://blog.imaginea.com/deep-copy-in-javascript/) to pass new Mongoose instances to the sub-projects, but it throws RangeError: Maximum call stack size exceeded.

I want to know whether there is any way to use multiple databases with Mongoose, or any workaround for this problem, because I find Mongoose quite easy and fast. Or can you recommend any other modules?

 Answers

42

One thing you can do is keep a subfolder for each project. Install mongoose in each subfolder and require() mongoose from its own folder in each sub-application, not from the project root or globally. That way each sub-project has one mongoose installation and one mongoose instance.

-app_root/
--foo_app/
---db_access.js
---foo_db_connect.js
---node_modules/
----mongoose/
--bar_app/
---db_access.js
---bar_db_connect.js
---node_modules/
----mongoose/

In foo_db_connect.js

var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/foo_db');
module.exports = exports = mongoose;

In bar_db_connect.js

var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/bar_db');
module.exports = exports = mongoose;

In db_access.js files

var mongoose = require("./foo_db_connect.js"); // bar_db_connect.js for bar app

Now, you can access multiple databases with mongoose.
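As a side note, current versions of mongoose also expose createConnection(), which returns a connection object with its own model() method, so models can be bound to a specific database without duplicate installs. The idea can be sketched without a real database as a per-connection model registry (this is a hypothetical stub for illustration, not mongoose's actual implementation):

```javascript
// Hypothetical stub illustrating the "models bound to a connection" idea
// behind mongoose's createConnection(). No real database is involved.
function createConnection(uri) {
  const models = new Map(); // each connection keeps its own model registry
  return {
    uri,
    model(name, schema) {
      if (!models.has(name)) models.set(name, { name, schema, uri });
      return models.get(name);
    },
  };
}

// One connection per sub-project, each with independent models.
const fooDb = createConnection('mongodb://localhost/foo_db');
const barDb = createConnection('mongodb://localhost/bar_db');

const FooUser = fooDb.model('User', { name: String });
const BarUser = barDb.model('User', { name: String });

console.log(FooUser.uri); // mongodb://localhost/foo_db
console.log(BarUser.uri); // mongodb://localhost/bar_db
```

Each connection keeps its own registry, which is the same isolation the per-subfolder install achieves.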

Tuesday, June 1, 2021
 
Nate
answered 7 Months ago
17

It is possible to use tables from different databases in one query, as long as your current connection is allowed to access both databases.

You just need to prefix every table name with the database name:

SELECT * FROM `databasename`.`tablename` ...  
... LEFT JOIN `databasename_2`.`tablename`....
Saturday, May 29, 2021
 
MannfromReno
answered 7 Months ago
69

So as you note, the default in mongoose is that when you "embed" data in an array like this, you get an _id value for each array entry as part of its own sub-document properties. You can actually use this value to determine the index of the item you intend to update. The MongoDB way of doing this is the positional $ operator, which holds the "matched" position in the array:

Folder.findOneAndUpdate(
    { "_id": folderId, "permissions._id": permission._id },
    {
        "$set": {
            "permissions.$": permission
        }
    },
    function(err, doc) {
        // handle the error or work with the returned document here
    }
);

That .findOneAndUpdate() method will return the document (add { "new": true } to the options if you want the modified version back), or you can just use .update() as a method if you don't need the document returned. The main parts are "matching" the element of the array to update and "identifying" that match with the positional $ mentioned earlier.

Then of course you are using the $set operator so that only the elements you specify are actually sent "over the wire" to the server. You can take this further with "dot notation" and just specify the elements you actually want to update. As in:

Folder.findOneAndUpdate(
    { "_id": folderId, "permissions._id": permission._id },
    {
        "$set": {
            "permissions.$.role": permission.role
        }
    },
    function(err, doc) {
        // handle the error or work with the returned document here
    }
);

So this is the flexibility that MongoDB provides, where you can be very "targeted" in how you actually update a document.
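As an illustration of that targeting, here is a plain-JavaScript sketch of what the "match, then update the matched position" step effectively does to a document in memory (hypothetical data, no MongoDB involved):

```javascript
// In-memory sketch of matching "permissions._id" and updating that
// array position, as the positional $ operator does server-side.
const folder = {
  _id: 'folder1',
  permissions: [
    { _id: 'p1', role: 'viewer' },
    { _id: 'p2', role: 'editor' },
  ],
};

function setMatchedPermissionRole(doc, permissionId, role) {
  // "matching" step: find the array index, like the query condition does
  const idx = doc.permissions.findIndex(p => p._id === permissionId);
  if (idx === -1) return doc; // no match: nothing to update
  // "$set" step: change only the matched element, like "permissions.$.role"
  doc.permissions[idx].role = role;
  return doc;
}

setMatchedPermissionRole(folder, 'p2', 'admin');
console.log(folder.permissions[1].role); // admin
```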

What this does do, however, is "bypass" any logic you might have built into your "mongoose" schema, such as "validation" or other "pre-save hooks". That is because the "optimal" way is a MongoDB "feature" and how it is designed. Mongoose itself tries to be a "convenience" wrapper over this logic. But if you are prepared to take some control yourself, then updates can be made in the most optimal way.

So where possible, keep your data "embedded" and don't use referenced models. This allows the atomic update of both "parent" and "child" items in simple updates where you don't need to worry about concurrency. It is probably one of the reasons you selected MongoDB in the first place.

Wednesday, June 2, 2021
 
Raef
answered 7 Months ago
47

You'll wish you had used separate databases:

  • If you ever want to grant permissions to the databases themselves to clients or superusers.
  • If you ever want to restore just one client's database without affecting the data of the others.
  • If there are regulatory concerns governing your data and data breaches, and you belatedly discover that these regulations can only be met by having separate databases.
  • If you ever want to easily move your customer data to multiple database servers or otherwise scale out, or move larger/more important customers to different hardware. In a different part of the world.
  • If you ever want to easily archive and decommission old customer data.
  • If your customers care about their data being siloed, and they find out that you did otherwise.
  • If your data is subpoenaed and it's hard to extract just one customer's data, or the subpoena is overly broad and you have to produce the entire database instead of just the data for the one client.
  • When you forget to maintain vigilance and just one query slips through that didn't include AND CustomerID = @CustomerID. Hint: use a scripted permissions tool, or schemas, or wrap all tables with views that include WHERE CustomerID = SomeUserReturningFunction(), or some combination of these.
  • When you get permissions wrong at the application level and customer data is exposed to the wrong customer.
  • When you want to have different levels of backup and recovery protection for different clients.
  • Once you realize that building an infrastructure to create, provision, configure, deploy, and otherwise spin up/down new databases is worth the investment because it forces you to get good at it.
  • When you didn't allow for the possibility of some class of people needing access to multiple customers' data, and you need a layer of abstraction on top of Customer because WHERE CustomerID = @CustomerID won't cut it now.
  • When hackers target your sites or systems, and you made it easy for them to get all the data of all your customers in one fell swoop after getting admin credentials in just one database.
  • When your database backup takes 5 hours to run and then fails.
  • When you have to get the Enterprise edition of your DBMS so you can make compressed backups so that copying the backup file over the network takes less than 5 hours more.
  • When you have to restore the entire database every day to a test server which takes 5 hours, and run validation scripts that take 2 hours to complete.
  • When only a few of your customers need replication and you have to apply it to all of your customers instead of just those few.
  • When you want to take on a government customer and find out that they require you to use a separate server and database, but your ecosystem was built around a single server and database and it's just too hard or will take too long to change.

You'll be glad you used separate databases:

  • When a pilot rollout to one customer completely explodes and the other 999 customers are completely unaffected. And you can restore from backup to fix the problem.
  • When one of your database backups fails and you can fix just that one in 25 minutes instead of starting the entire 10-hour process over again.

You'll wish you had used a single database:

  • When you discover a bug that affects all 1000 clients and deploying the fix to 1000 databases is hard.
  • When you get permissions wrong at the database level and customer data is exposed to the wrong customer.
  • When you didn't allow for the possibility of some class of people needing access to a subset of all the databases (perhaps two customers merge).
  • When you didn't think how hard it would be to merge two different databases of data.
  • When you've merged two different databases of data and realize one was the wrong one, and you didn't plan for recovering from this scenario.
  • When you try to grow past 32,767 customers/databases on a single server and find out that this is the maximum in SQL Server 2012.
  • When you realize that managing 1,000+ databases is a bigger nightmare than you ever imagined.
  • When you realize that you can't onboard a new customer just by adding some data in a table, and you have to run a bunch of scary and complicated scripts to create, populate, and set permissions on a new database.
  • When you have to run 1000 database backups every day, make sure they all succeed, copy them over the network, restore them all to a test database, and run validation scripts on every single one, reporting any failures in a way that is guaranteed to be seen and is easily and quickly actionable. And then 150 of these fail in various places and have to be fixed one at a time.
  • When you find out you have to set up replication for 1000 databases.

Just because I listed more reasons for one doesn't mean it is better.
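On the vigilance point in the first list (queries that forget AND CustomerID = @CustomerID), the guard can also be enforced in application code; here is a hypothetical JavaScript sketch of such a wrapper, with illustrative names:

```javascript
// Hypothetical guard: every query object must carry the tenant key,
// mirroring the "never forget AND CustomerID = @CustomerID" rule.
function tenantScopedQuery(customerId, query) {
  if (customerId == null) {
    throw new Error('customerId is required for every query');
  }
  // merge the tenant filter in, overriding anything the caller passed
  return Object.assign({}, query, { customerId });
}

const q = tenantScopedQuery(42, { status: 'active' });
console.log(q); // { status: 'active', customerId: 42 }
```

Routing all data access through one such function is the application-level equivalent of wrapping every table in a tenant-filtered view.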

Some readers may get value from MSDN: Multi-Tenant Data Architecture, or perhaps SaaS Tenancy App Design Patterns, or even Developing Multi-tenant Applications for the Cloud, 3rd Edition.

Saturday, July 31, 2021
 
PJQuakJag
answered 5 Months ago
83

That is actually two questions; just for future reference, you usually do better asking one at a time.

1. Pluralization

The short answer is that it is good practice. In more detail, it is generally logical because what you are referring to is a "collection" of items or objects. The general inference from a "collection" is "many", and therefore a plural form of what the "object" itself is named.

So a "people" collection implies that it is in fact made up of many "person" objects, just as "dogs" relates to "dog" or "cats" to "cat". Not necessarily "bovines" to "cow", but generally speaking mongoose does not really deal with polymorphic entities, so there would not be "bull" or "bison" objects in there unless specified by some other property on "cow".

You can of course change this if you want in either of these forms and specify your own name:

var personSchema = new Schema({ ... },{ "collection": "person" });

mongoose.model( "Person", personSchema, "person" );

But a model generally has a "singular" name, and the "collection" is the plural form by good practice when there are many. Besides, every SQL database ORM I can think of also does it this way, so this just follows the practice most people are already used to.
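For illustration, the model-name-to-collection-name mapping can be sketched roughly as follows (a simplified, hypothetical version; mongoose's real pluralization rules handle many more irregular forms):

```javascript
// Simplified sketch of the model-name -> collection-name mapping.
// Real mongoose handles far more irregular plurals than this.
const irregulars = { person: 'people', goose: 'geese', mouse: 'mice' };

function collectionName(modelName) {
  const lower = modelName.toLowerCase();
  if (irregulars[lower]) return irregulars[lower];
  if (lower.endsWith('s')) return lower + 'es'; // e.g. "class" -> "classes"
  return lower + 's';                           // e.g. "dog" -> "dogs"
}

console.log(collectionName('Person')); // people
console.log(collectionName('Dog'));    // dogs
console.log(collectionName('Cat'));    // cats
```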

2. Why Schema?

MongoDB is actually "schemaless", so it does not have any internal concept of "schema", which is one big difference from SQL based relational databases which hold their own definition of "schema" in a "table" definition.

While this is often actually a "strength" of MongoDB in that data is not tied to a certain layout, some people actually like it that way, or generally want to otherwise encapsulate logic that governs how data is stored.

For these reasons, mongoose supports the concept of defining a "Schema". This allows you to say "which fields" are "allowed" in the collection (model) this is "tied" to, and which "type" of data may be contained.

You can of course have a "schemaless" approach, but the schema object you "tie" to your model still must be defined, just not "strictly":

var personSchema = new Schema({ },{ "strict": false });
mongoose.model( "Person", personSchema );

Then you can pretty much add whatever you want as data without any restriction.

The reverse case though is that people "usually" do want some type of rules enforced, such as which fields and what types. This means that only the "defined" things can happen:

var personSchema = new Schema({
    name: { type: String, required: true },
    age: Number,
    sex: { type: String, enum: ["M","F"] },
    children: [{ type: Schema.Types.ObjectId, ref: "Person" }],
    country: { type: String, default: "Australia" }
});

So the rules there break down to:

  1. "name" must have "String" data in it only. A bit of a JavaScript idiom here, as everything in JavaScript will actually stringify. The other thing here is "required": if this field is not present in the object sent to .save(), it will throw a validation error.

  2. "age" must be numeric. If you try to .save() this object with anything other than numeric data in this field, you will get a validation error.

  3. "sex" must be a string again, but this time we are adding a "constraint" to say what the valid values are. In the same way, this can also throw a validation error if you do not supply the correct data.

  4. "children" is actually an Array of items, but these are just "reference" ObjectId values that point to items in another model, or in this case this one. So this will keep that ObjectId reference in there when you add to "children". Mongoose can actually .populate() these later with their actual "Person" objects when requested to do so. This emulates a form of "embedding" in MongoDB, but it is used when you actually want to store the object separately without "embedding" it every time.

  5. "country" is again just a String and requires nothing special, but we give it a default value to fill in if no other is supplied explicitly.
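As a rough sketch of how rules like these behave at save time, here is a hypothetical plain-JavaScript check (not mongoose's actual validator, just an illustration of the same idea):

```javascript
// Hypothetical mini-validator mirroring the schema rules above.
// Mongoose's real validation is richer; this just shows the behavior.
function validatePerson(doc) {
  const errors = [];
  if (typeof doc.name !== 'string') errors.push('name is required and must be a String');
  if (doc.age !== undefined && typeof doc.age !== 'number') errors.push('age must be a Number');
  if (doc.sex !== undefined && !['M', 'F'].includes(doc.sex)) errors.push('sex must be "M" or "F"');
  if (doc.country === undefined) doc.country = 'Australia'; // the default kicks in
  return errors;
}

const ok = { name: 'Alice', age: 30, sex: 'F' };
console.log(validatePerson(ok));  // []
console.log(ok.country);          // Australia

const bad = { age: 'thirty' };
console.log(validatePerson(bad).length); // 2  (missing name, non-numeric age)
```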


There are many other things you can do, I would suggest really reading through the documentation. Everything is explained in a lot of detail there, and if you have specific questions then you can always ask, "here" (for example).

So MongoDB does things differently to how SQL databases work, and throws out some of the things that are generally held in "opinion" to be better implemented at the application business logic layer anyway.

Hence Mongoose tries to "put back" some of the good things people like about working with traditional relational databases, and allows some rules and good practices to be easily encapsulated without writing other code.

There is also some logic that helps in "emulating" (cannot stress that enough) "joins", as there are methods that "help" you retrieve "related" data from other sources, essentially by providing definitions of which "model" the data resides in within the "Schema" definition.

Did I also mention that "Schema" definitions are again just objects and re-usable? Well, yes they are, and they can in fact be tied to "many" models, which may or may not reside in the same database.

Everything here has a lot more function and purpose than you are currently aware of, and the good advice is to head forth and "learn". That is the usual path to the realization: "Oh, now I see, that's why they do it that way."

Monday, November 1, 2021
 
Jim
answered 1 Month ago