Introduction to EnduraData Content Replication


edintro



NAME

edintro - Introduction to EnduraData File Sync, Server Replication, Transfer and Content distribution

SYNOPSIS

EnduraData EDpCloud File Sync, File Replication and File Transfer Suite (edpcloud)

USE CASES

The following are some use cases of EDpCloud file replication and synchronization software.

File sync and data replication from one site to another site
File sync and data replication from many sites to a single site
Distribution of data from one to many locations for workflow
Data migration between systems and geographic sites
Distribution of data from many locations to one location
Regularly scheduled backup of data to local and/or remote sites
Regularly scheduled backup of data to a portable drive
Dispersed work group data and file sharing
Workflow automation
Various repository file updates
Data archival, snapshots, data versions and recovery of old versions
Others. Feel free to contact EnduraData with any questions about additional uses.

The solution allows you to sync, replicate and move data between geographic sites, operating systems and hardware platforms. You can configure EDpCloud to distribute or to aggregate data.

EDpCloud FILE SYNC AND DATA REPLICATION DESCRIPTION

edpcloud allows you to synchronize, distribute and replicate content and files from one node to many nodes and from many nodes to one node using various configurations. It gives you full control over what to distribute and what to receive. A simple XML-like configuration file lets you set up anything from simple data replication to global, wide-area data and content distribution. Using regular expressions and domain decomposition, you can build simple or elaborate configurations that best suit your enterprise file replication needs and automate your data flow between systems and processes.

The following sections describe the commands and refer you directly to the documents that explain how to use edpcloud.

FILE SYNC AND REPLICATION TERMINOLOGY

edpcloud uses the notion of a file sender and a file receiver for its content distribution. Content is sent from a sender system to one or more receiver systems. Another notion is that of a link. A link is a logical grouping of one sender and one or more receivers. A configuration has one or more links in it. The configuration resides on both the sender and the receiver.

A PC or server can be a file sender, a receiver or both.

See the manual page for eddist.cfg for more information.
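
The sender, receiver and link terminology can be illustrated with a small data model. This is only a sketch of the concepts: the class, function and host names below are hypothetical and do not reflect the eddist.cfg syntax, which is described in its own manual page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical model of a link: one sender and one or more receivers."""
    name: str
    sender: str
    receivers: tuple

    def __post_init__(self):
        # A link without receivers has nowhere to send content.
        if not self.receivers:
            raise ValueError(f"link {self.name!r} needs at least one receiver")


def roles(host, links):
    """Return the set of roles ('sender', 'receiver') a host plays across links."""
    found = set()
    for link in links:
        if host == link.sender:
            found.add("sender")
        if host in link.receivers:
            found.add("receiver")
    return found


# Hypothetical configuration with two links: one distributes, one aggregates.
config = [
    Link("distribute", sender="hq", receivers=("branch1", "branch2")),
    Link("collect", sender="branch1", receivers=("hq",)),
]

print(sorted(roles("hq", config)))       # hq is both a sender and a receiver
print(sorted(roles("branch2", config)))  # branch2 only receives
```

In this model the host "hq" is both a sender and a receiver, matching the statement that a PC or server can play either role or both.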

INSTALLATION AND CONFIGURATION

edpcloud file replication and synchronization software is packaged in a simple tar file for Unix/Linux and Mac and in a setup.exe for Windows. When you install the software, the install application sets up your environment. The root directory for edpcloud is pointed to by the environment variable ED_BASE_DIR.

This directory has the following subdirectories:

etc

This subdirectory includes various configuration files such as eddist.cfg and edpasswd.

eddist.cfg contains the master configuration. edpasswd contains the console management password.

etc/certs

This is where you put your SSL certificates.

doc

This directory has documentation in PDF, man and HTML formats.

data

This subdirectory contains the data journals. It must have enough available space to accommodate the growth of the journals, especially when communication links may fail for long periods of time. The disk space used by the journals is a function of the number of links/receivers and the number of files that will be journaled.

tmp

This is the temporary space used for transient storage. The available disk space for tmp should be at least twice as big as the largest file that will be replicated.

logs

This is where the application logs are kept. Any errors encountered by the application will be logged in eddist*.log, ed_sender*.log or ed_receiver*.log.
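
The sizing rule for tmp and the log file names above lend themselves to a quick check. The following is a minimal sketch, not part of edpcloud: the function names are hypothetical, and REPLICATED_DIR is an invented variable used only to point the example at a directory tree to measure.

```python
import glob
import os
import shutil

# Log file patterns named in this page, relative to the logs subdirectory.
LOG_PATTERNS = ("eddist*.log", "ed_sender*.log", "ed_receiver*.log")


def largest_file_size(root):
    """Return the size in bytes of the largest regular file under root (0 if none)."""
    largest = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                largest = max(largest, os.path.getsize(path))
    return largest


def tmp_has_room(tmp_dir, largest_bytes):
    """True if tmp_dir has at least twice largest_bytes of free space."""
    return shutil.disk_usage(tmp_dir).free >= 2 * largest_bytes


def find_logs(base_dir):
    """List files under base_dir/logs that match the documented log patterns."""
    found = []
    for pattern in LOG_PATTERNS:
        found.extend(glob.glob(os.path.join(base_dir, "logs", pattern)))
    return sorted(found)


if __name__ == "__main__":
    base = os.environ.get("ED_BASE_DIR")
    source = os.environ.get("REPLICATED_DIR", ".")  # hypothetical, illustration only
    if base and os.path.isdir(os.path.join(base, "tmp")):
        biggest = largest_file_size(source)
        print("tmp has room:", tmp_has_room(os.path.join(base, "tmp"), biggest))
        for path in find_logs(base):
            print("log:", path)
    else:
        print("ED_BASE_DIR is not set or has no tmp subdirectory")
```

The check only compares free space against twice the largest file found; it does not account for several large files being replicated at the same time.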

COMMAND LINE INTERFACE

edpcloud comes with several commands. More information can be found in the doc directory or in the man pages. The following is a list of some commands:

edq

This command allows you to initiate content distribution. You can run this command manually, from a cron (UNIX/LINUX/MAC) or from a task scheduler (WINDOWS).

edmfq

edmfq is an advanced version of edq. It allows you to queue files and directories based on many criteria, such as modification time, file age and file size.

edstat

This command allows you to obtain statistics about the progress of the data distribution.

edpause

This command allows you to pause data distribution to one or more receivers/links. It can also be used to pause the file system change queueing without losing the file changes. Use this if you want to restart eddist.

edresume

This command allows you to resume data distribution to one or more receivers/links.

edcmpf

This command allows you to compare data between a source and destination directory.

edconfig

This command allows you to use a user interface to configure data distribution. This command is being phased out and replaced by a browser-based configuration. You can browse to https://localhost or https://hostname to configure. You will need PHP version 5.3.6 or higher.

edstopall

This command allows Windows users to stop services. This stops and restarts Windows services.

edpcloud.sh startall|stopall

This command allows Unix users to start or stop all services.

edverify

This command allows you to verify that a configuration is valid before applying it.

edreport

This command allows you to obtain a technical report for troubleshooting or in case you need to contact EnduraData technical support.

edresolv

This command allows you to obtain information about advertised addresses and hostnames for the local node.

Requirements for Using the Graphical User Interface

If you have purchased a copy of the edpcloud GUI, this section applies to you. In order to use the edpcloud GUI, you need to install the Java Virtual Machine from Sun Microsystems. Please download it using this link: https://www.java.com/download/index.jsp

Changing ports and Dealing with firewalls

Make sure you configure your firewall to allow access to the two ports (9000, 8888) that are used by EnduraData Content Distribution. You may change these two ports by setting the following two environment variables:

ED_COLLECTOR_PORT :

This environment variable sets the port on which the EnduraData eddist server listens for management commands.

ED_DISTRIBUTION_PORT :

This environment variable sets the port on which the EnduraData eddist server listens for data distribution connections.

Windows users can also set the ports and other parameters in the registry by running edparams.exe. Advanced users can edit the registry directly with regedit.

Changing nginx.conf to allow replication to be managed from specific IPs

Take a look at nginx.conf under the edpcloud/edpwebgui subdirectories to see how you can use allow and deny directives to restrict access to the file replication server. After you edit nginx.conf, restart the server software so that the new configuration is loaded.

Running under Windows

Once you have installed and configured edpcloud, you will need to start the services manually.

You can start the services using edstart. You can restart the services using edrestart.

These services are configured to start automatically after your computer reboots, but you may restart or stop them using edrestart or edstop respectively.

Windows 7 users will need to use the Windows service manager to start and stop the services.

On all Windows platforms, configure the EnduraData services to restart automatically after a failure, with no limit on the number of restarts.

Running under Unix/Linux/Mac

Unix users can invoke ". $ED_BASE_DIR/bin/edpcloud.sh startall" from their startup environment, or run "$ED_BASE_DIR/bin/edpcloud.sh startall|stopall|restartall" manually.
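
For sites that drive the services from a script, the invocation above can be wrapped as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, only the three actions named on this page are accepted, and the meaning of the script's exit status is assumed rather than documented here.

```python
import os
import subprocess

ACTIONS = ("startall", "stopall", "restartall")  # actions named in this page


def edpcloud_command(action, base_dir=None):
    """Build the argument list for edpcloud.sh; raises on an unknown action."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action {action!r}; expected one of {ACTIONS}")
    base_dir = base_dir or os.environ.get("ED_BASE_DIR")
    if not base_dir:
        raise RuntimeError("ED_BASE_DIR is not set")
    return [os.path.join(base_dir, "bin", "edpcloud.sh"), action]


def run_edpcloud(action, base_dir=None):
    """Run the script and return its exit status (assumed to be 0 on success)."""
    return subprocess.run(edpcloud_command(action, base_dir), check=False).returncode
```

Validating the action before running anything keeps a typing mistake from reaching the script.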

Managing Content Distribution

A few commands allow you to manage data replication and content distribution.

edpause : Allows you to pause replication and content distribution. See edpause manual page for more information.
edresume : Allows you to resume replication and content distribution. See edresume manual page for more information.
edq : Allows you to queue files and directories for data replication and content distribution. See edq manual page for more information.
edstat : Allows you to get statistics and status. See edstat manual page for more information.

FILES

$ED_BASE_DIR/etc/eddist.cfg

$ED_BASE_DIR/etc/edpasswd

ENVIRONMENT

ED_BASE_DIR: The top directory where the content distribution software is installed. This directory also contains the configuration, journal and log files.

ED_COLLECTOR_PORT: The port on which the EnduraData eddist server listens for management commands.

ED_DISTRIBUTION_PORT: The port on which the EnduraData eddist server listens for data distribution connections.

ED_CREATE_STOREPATH: If set, tells edpcloud to automatically create a storage path that does not exist.

ED_APPLY_CONFIG: If set, tells edpcloud to automatically apply a new configuration.
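
To confirm that a firewall admits the ports in use, the two port variables can be combined with the two default ports (9000 and 8888) into a simple reachability check. This is a hedged sketch: the function names are hypothetical, and because this page does not state which default port belongs to which variable, the example checks the defaults plus any overrides rather than mapping one to the other.

```python
import os
import socket

DEFAULT_PORTS = (9000, 8888)  # the two ports named in this page
PORT_VARIABLES = ("ED_COLLECTOR_PORT", "ED_DISTRIBUTION_PORT")


def ports_to_check(environ=os.environ):
    """Return the sorted default ports plus any overrides found in environ."""
    ports = set(DEFAULT_PORTS)
    for name in PORT_VARIABLES:
        value = environ.get(name, "").strip()
        if value:
            ports.add(int(value))  # raises ValueError on a non-numeric setting
    return sorted(ports)


def is_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for port in ports_to_check():
        state = "reachable" if is_open("localhost", port) else "not reachable"
        print(f"port {port}: {state}")
```

Run from a remote node with the server's hostname in place of "localhost" to test the path through the firewall rather than the local listener.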

TECHNICAL SUPPORT

Users who have a support contract or are evaluating the solution can contact EnduraData technical support at <support@enduradata.com>.

AUTHOR

A. A. El Haddi, elhaddi@ieee.org