Configuring remote Nano Client

The remote.yml file contains the properties of the Nano, such as served resources, resource permissions, authenticity, indexing options and more. This file is updated whenever the owner changes a configuration locally through the UI or remotely from the browser.

This file can also be updated remotely through the Nano API for the convenience of the owner. Because this convenience carries security risks, the owner is strongly encouraged to consider their security needs and restrict editing of this file by either:

  • Setting remote_admin_policy in the local.yml to either restrict or deny

  • Making this file read-only for the Nano process

Depending on the choice, the higher degree of security comes with some inconvenience. If this file is read-only or its editing is disabled by policy, the owner will have to revert these settings before the configuration in this file can be changed through the Nano API. This requires either physical access by the owner or the use of a secure remote management system such as SSH or RDP.

name

default: “” (hostname)

Name of the Nano instance for convenience. By default this will be the hostname of the machine when a new configuration is generated.

admin_password

default: null

The admin password is only used if the remote_admin_policy in local.yml is set to restrict.

Do not set this as a plain-text password! That is insecure and will not work. The password value must be set through the software to be hashed for secure storage.

drives

default: []

Example configuration:

drives:
- path: /mnt/data/documents
  rooms:
    - 4CM7XP733333
    - 4CM7YVQ33333

  • path: Local path in the filesystem that shall be the root for the drive

  • rooms: List of the rooms, by their id, that the drive should be attached to

The owner of the Nano can declare the system resources to be handled, out of the resource types the Nano supports (for example drives). Each drive needs to be mapped to a room so the server can make it available to the desired audience.

For more freedom, Nano allows mapping a drive to multiple rooms. This is convenient when different groups should have different default permissions on the drive, or when the groups should not know about each other.

If the path of a drive becomes inaccessible, searches will still work for contents already indexed, but no other operations will succeed.

Warning

If resources on different Nanos of the owner list the same room, that room will be in a competing-conflict state for the given resources. Nano will notify the owner if such a conflict state occurs.

room_blocks

default: []

Example configuration:

room_blocks:
  - 4CM7XP733333: ["2467z86822222:0GLPOdcixTh3W0e/j9TUfEPaXl398pvTvN/w12miXS8="]

The Nano automatically updates these values to the latest known blocks. This is safe because the authenticity of new blocks is verified with digital signatures from the owner account’s keyring.

All membership permissions are stored on the server, secured by a digitally signed blockchain which only the owner is ever able to edit. The only malicious action a server could take is to drop the top N blocks of the chain (a lie of omission). The Nano can prevent even that by storing the latest known block of the chain and requiring that it is always available. If the stored identifiers of a room are not present or seem to be invalid, the owner must re-attach the room to the Nano.
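The omission check described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, block identifiers are treated as opaque strings, the chain is assumed to be listed oldest first, and signature verification (which the Nano performs with the owner's keyring) is left out.

```python
def chain_contains_pinned_block(server_chain, pinned_block):
    """Return True if the locally stored block is still in the served chain."""
    return pinned_block in server_chain


def accept_chain(server_chain, pinned_block):
    """Accept a chain from the server and return the new block to store.

    Hypothetical helper: raises ValueError when the server has dropped the
    block that this Nano recorded as the latest known one.
    """
    if not chain_contains_pinned_block(server_chain, pinned_block):
        raise ValueError("latest known block is missing from the served chain")
    # Assumption: the chain is ordered oldest first, so the last entry is newest.
    return server_chain[-1]
```

A chain that was truncated below the stored block is rejected, while a chain that only grew is accepted and its newest block becomes the stored value.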

deny_anonymous

default: false

This setting provides a local policy override over the permission settings of rooms.

Permissions of a room’s config blockchain may indicate that the associated resources should be accessible by anonymous requests. For extra security the owner may set this to true, which will block any anonymous access from any room.

require_explicit_peer_trust

default: false

This setting provides a local policy enforcement of the trust level for peer accounts to be accepted for communication.

For extra security the owner may set this to true, which will disallow any peer account access unless their identity has been explicitly marked as trusted by the owner.

indexed_languages

default: []

The languages specifically supported for accurate indexing by the search database. Languages that are not specified will be indexed in a generic field with reduced accuracy. The more languages are set up, the more costly searching becomes.

In most expected cases primarily one or very few languages will be present in the indexed documents, so this is a great place to optimize the search times.

Valid arguments:
  • all: All supported languages (not recommended if the quickest search times are important)

  • []: The system locale at the time the setting is applied, and English

  • List of ISO language codes, for example hu, en, or de. Unsupported language codes will be ignored. If none of the listed codes are supported, the config will be treated as if it were empty
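The three cases above can be sketched as a small resolution function. This is not Nano's implementation: the SUPPORTED set, the function names and the way the system locale is read are assumptions made for illustration, and an unsupported system locale is simply skipped here.

```python
import locale

# Hypothetical set of languages the search database supports; the real list
# is not given in this document.
SUPPORTED = {"hu", "en", "de"}


def system_language():
    """Return the language part of the system locale, or None if unknown."""
    code = locale.getlocale()[0]  # for example "en_US", or None if unset
    return code.split("_")[0].lower() if code else None


def resolve_indexed_languages(value, supported=SUPPORTED, system_lang=None):
    """Resolve an indexed_languages setting to a sorted list of codes."""
    if value == "all":
        return sorted(supported)
    # Unsupported codes are ignored.
    chosen = [code for code in (value or []) if code in supported]
    if not chosen:
        # Empty list, or no supported code: system locale plus English.
        lang = system_lang if system_lang is not None else system_language()
        chosen = [code for code in (lang, "en") if code in supported]
    return sorted(set(chosen))
```

For example, with a Hungarian system locale an empty list would resolve to en and hu in this sketch, while a list containing only unsupported codes resolves the same way as an empty list.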

Warning

Changing this value will cause all the indexes to be deleted and all content to be re-indexed.

indexer_sync_interval

default: 4.0

indexer_remove_lost_count

default: 3

indexer_force_sync_on_startup

default: true

Operating systems do not offer a mechanism to reliably and robustly track changes on their filesystems. This will never be solved universally due to technical limitations or the performance and resource cost such guarantees would require.

The indexer process will use filesystem events where they are available, but even in such cases changes can be missed if the Nano is not running when they occur. Even on a local filesystem, events for a path may not be generated if they involve symlinks.

Because of these issues, a periodic polling synchronization must be used no matter the platform or underlying filesystem. The synchronization must also check whether indexed files still exist, as their removal events could also be lost, if generated at all. Some network drives could become unavailable for an extended period of time, but might eventually come back again.

To handle such cases and avoid removing and re-indexing files all the time, indexed files will be marked as lost and will only be removed after they have been consecutively missing for a number of synchronizations. In addition to the consecutive missing count the removal will only take place if the first missing state was logged at least indexer_sync_interval multiplied by indexer_remove_lost_count long ago.
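The two removal conditions can be expressed as a short predicate. This is a sketch, not the indexer's code: the function and parameter names are hypothetical, "a number of synchronizations" is assumed to mean indexer_remove_lost_count, and timestamps are assumed to use the same unit as indexer_sync_interval, which this document does not state.

```python
def should_remove(missing_count, first_missing_at, now,
                  sync_interval=4.0, remove_lost_count=3):
    """Decide whether a lost file may be dropped from the index.

    missing_count: consecutive synchronizations in which the file was missing.
    first_missing_at, now: timestamps in the same unit as sync_interval.
    """
    # Condition 1: the file was missing in enough consecutive synchronizations.
    enough_misses = missing_count >= remove_lost_count
    # Condition 2: the first missing state was logged long enough ago.
    enough_time = (now - first_missing_at) >= sync_interval * remove_lost_count
    return enough_misses and enough_time
```

With the defaults, a file is removed only after three consecutive misses and once 12.0 time units have passed since it was first seen missing, so a burst of quick synchronizations alone does not remove it.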

indexer_minimum_delay_between_path_sync

default: 0.0

The indexer processes and re-processes files for searching periodically and in response to filesystem events. This reprocessing work can be excessive at times, so it can be throttled here.

Any two file synchronizations, of any kind, must be separated by at least this delay, which spreads the work out over time. This is most useful to tune when the same files are changed frequently and in rapid succession.

This value is in seconds and may be set to 5 at most. A higher value is treated as 5.
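A minimal sketch of the cap and the resulting throttle is shown below. The class and function names are hypothetical and the real indexer may schedule its work differently; only the 5 second upper bound comes from this document.

```python
MAX_DELAY = 5.0  # seconds; the documented upper bound


def effective_delay(configured):
    """Return the delay actually used: values above 5 are treated as 5."""
    return min(float(configured), MAX_DELAY)


class PathSyncThrottle:
    """Keeps consecutive file synchronizations at least `delay` seconds apart."""

    def __init__(self, configured_delay):
        self.delay = effective_delay(configured_delay)
        self.last_sync = None  # time of the most recent sync, if any

    def wait_time(self, now):
        """Seconds a sync requested at `now` still has to wait."""
        if self.last_sync is None:
            return 0.0
        return max(0.0, self.last_sync + self.delay - now)

    def record_sync(self, now):
        """Remember that a sync was performed at `now`."""
        self.last_sync = now
```

With the default of 0.0 the throttle never makes a synchronization wait.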

indexer_reserved_space

default: 3000

The free-space threshold, in megabytes, for the index storage. If the available space on the index storage falls below this threshold, the indexer will halt all processes until enough space is available for work to proceed.
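The check could look roughly like the following sketch. It assumes that the threshold refers to free space on the volume holding the index and that a megabyte is 1024 × 1024 bytes; neither is confirmed by this document, and the function names are hypothetical.

```python
import shutil

BYTES_PER_MB = 1024 * 1024  # assumption: "megabytes" means MiB here


def has_reserved_space(free_bytes, reserved_mb=3000):
    """Return True if the free space is at or above the reserved threshold."""
    return free_bytes >= reserved_mb * BYTES_PER_MB


def index_storage_ok(index_path, reserved_mb=3000):
    """Check the volume that holds index_path (a hypothetical index location)."""
    return has_reserved_space(shutil.disk_usage(index_path).free, reserved_mb)
```

An indexer loop built on this would pause while index_storage_ok returns False and resume once space has been freed.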

indexer_ignored_mimes

default:
- application/javascript
- text/css

The indexer should not process all file types unconditionally. Many files are purely technical and unnecessary for indexing. For example, if a user saves a web page from the browser, many resource files will be saved alongside the HTML. Indexing these would not benefit the user in most circumstances, and they would require language-specific tokenizers for optimal matching.

indexer_extract_text_limit

default: 1500000 (almost 3 times the size of War and Peace)

Limit of the indexer for extracting text from a resource.

remote_files_restriction

default: true

The server can attempt to attach zone information to files that are modified by remote requests. Such information is usually platform, filesystem and OS specific, if available at all. It can provide some protection by preventing the execution of untrusted files.

server_process_number

default: 1

The number of general server processes can be increased to improve performance for Nanos with very high traffic.

secrets

default: []

Encryption secrets are used for various steps. They are generated automatically and may be refreshed periodically. They can be replaced on demand by an admin request. Modifying them manually is not possible, because the last value is a fingerprint of the others and of the local machine’s identity. If the configs are copied by a deployment system, the new machines will automatically generate secure secrets.