2011-04-05 Dejan Muhamedagic
    Medium: oracle: improve oracle process list test (bnc#673027)

2011-04-05 ultrabug@gentoo.org
    Low: build: create symlinks via autotools

2011-04-05 Fabio M. Di Nitto
    remove slang resource scripts that really belong with rgmanager

2011-04-04 Florian Haas
    Low: heartbeat/VirtualDomain: fix a few whitespace errors

2011-04-04 Andreas Kurz
    Low: VirtualDomain: fixed error handling on stop
    Over time the libvirt error messages changed slightly. To recognize an already stopped guest during shutdown escalation the error message of "virsh destroy" is interpreted. This fix allows it to work with versions of libvirt < 0.7.x

2011-04-01 Fabio M. Di Nitto
    Merge branch 'master' of https://github.com/dmuhamedagic/resource-agents into dmuhamedagic-master

2011-03-31 Holger Teutsch
    build: resource-agents.spec: add --with-ras-set to configure options
    GNUmakefile

    commit 5e845734ca77336881b361b622ea8ae86cb092c4
    Author: Holger Teutsch
    Date: Mon Mar 14 17:00:27 2011 +0100
    build: resource-agents.spec: add --with-ras-set to configure options
    Patch: Change from hg to git in agent's GNUmakefile

    commit 02911a66d4f509feba99429ca3dddfbb2630eae2
    Author: Holger Teutsch
    Date: Sun Mar 13 13:52:29 2011 +0100
    Medium: GNUmakefile: Change hg commands to git commands

2011-03-30 Lon Hohberger
    resource-agents: Trim trailing slash for nfs clients
    The exportfs command and/or rpc.mountd trim the trailing slashes when reporting things in /var/lib/nfs/etab, causing mismatch problems for the nfsclient resource agent.
    Resolves: rhbz#592624

2011-03-28 Marek 'marx' Grac
    rgmanager: Fix problems in generated XML config file for tomcat5
    Add support for XML files to resource agents.
    Resolves: rhbz#637802

2011-03-25 Pavel Levshin
    Medium: VirtualDomain: properly wait until $domain_name is non-empty

2011-03-22 Lon Hohberger
    resource-agents: Use shutdown immediate in oracledb.sh
    Resolves: rhbz#633992

2011-03-02 Fabio M. Di Nitto
    Merge branch 'master', remote-tracking branch 'lha/master'

2011-03-01 Fabio M. Di Nitto
    Fix build with recent gcc being too paranoid
    drop more redundant bits after merge
    Fix copyright headers for all files imported from rgmanager ras
    Drop another required file by merge
    Merge Author files
    doc: move README.webapps in the linux-ha section where it belongs
    remove duplicated licence files

2011-02-25 Lars Ellenberg
    Repository has moved to https://github.com/ClusterLabs/resource-agents
    git://github.com/ClusterLabs/resource-agents.git

2011-02-22 Fabio M. Di Nitto
    build: drop -O0 in default CFLAGS
    build: relax automake requirements and drop dist-xz
    allow SLE11 SP1 to build the source
    build: fix automake version check and options
    add --with-ras-set option to exclude/include rgmanager/linux-ha agents

2011-02-21 Fabio M. Di Nitto
    Restore BUILD_VERSION via git attributes

2011-02-20 Holger Teutsch
    Doc: ra2refentry.xsl: Add preservation of empty lines to longdesc of parameters as well
    --HG-- extra : rebase_source : 050789f518eceae58d3cd91697befbbbfbe69acf

2011-02-18 Holger Teutsch
    Doc: ra2refentry.xsl: last line of description did not get a after last patch
    Remove "if" that got redundant after last patch. Improve readability of generated xml code
    --HG-- extra : rebase_source : 4a1fe87cb9177646959e95f02d2705e0fc908caa

2011-02-22 Holger Teutsch
    Low: Tools: ocft: db2: Fix typo
    Somehow the "1" got lost during my submission process
    --HG-- extra : rebase_source : f86cd3bbbcba78e2dc493fd79e94c219c6f01dfb

2011-02-22 Andreas Kurz
    Medium: VirtualDomain: correctly create migration URI when target is an FQDN
    Some machines (notably Fedora/RHEL/CentOS systems) use an FQDN as their default node name. On those, create the migration URI by injecting the migration suffix after the unqualified host name, but before the domain name.
    Thus, a migration target of "alice.example.com" with a migration suffix of "-x" now creates a migration URI with "alice-x.example.com" rather than "alice.example.com-x".

2011-02-17 Fabio M. Di Nitto
    Add README to tarball
    Merge .hgignore into .gitignore
    Drop .hgtags
    Fix make distcheck again after merge
    Merge branch 'master' of github.com:ClusterLabs/agents-lha
    build: drop concept of BUILD_VERSION
    this removes also usage of HG (since we switch to git) and use VERSION that exports the same values as the previous B_VER implementation
    Merge build systems
    This commit will allow to do releases too and fix make distcheck
    Fix make distcheck in ldirectord/Makefile.am
    README appears to have vanished in the void
    Fix make maintainer-clean target
    Fix file permissions
    Merge branch 'master' of http://git.fedorahosted.org/git/resource-agents

2011-02-16 Dejan Muhamedagic
    Add tag agents-1.0.4 for changeset 8a469febdcca
    build,doc: create manpage for nginx
    build: update ChangeLog and set release to 1.0.4
    Medium: Tools: ocft: testcase for LVM
    Medium: LVM: exit with proper codes on bad configuration

2011-02-15 Dejan Muhamedagic
    Medium: Tools: ocft: testcase for Filesystem
    Medium: Filesystem: exit with proper codes on bad configuration

2011-02-15 Holger Teutsch
    Medium: Tools: ocft: testcase for db2

2011-02-15 andrew@beekhof.net
    Low: Filesystem: cman can run ocfs2

2011-02-14 Dominik Klein
    Low: conntrackd: trivial spelling fixes

2011-02-14 Holger Teutsch
    Tools: ocft: test monitor instead of status in apache and mysql
    --HG-- extra : rebase_source : 3e95d317a315165d3906fa536e8daf20213b1a53

2011-02-14 Andreas Kurz
    Medium: VirtualDomain: support specifying a dedicated migration network
    Add a facility for setting a dedicated migration network. People typically don't like to tax their switched network with traffic associated with domain live migrations.
    This approach introduces a new parameter, "migration_network_suffix", which, when set, appends the given suffix to the migration target's host name. That hostname is then expected to resolve, via an /etc/hosts or DNS entry, to the peer node's IP address in the migration subnet.

2011-02-21 Florian Haas
    Low: doc: fix "xml" target
    Doesn't hurt to include the generated DocBook appendix in the xml target, on top of the man pages.

2011-02-16 Michael Prokop
    Low: Tools: ocf-tester: redirect error messages to stderr
    The --help option isn't handled in the command line parsing and therefore doesn't have the same option handling as the -h option. As a result do not mention the --help option in the usage information.
    Thanks: Lars Ellenberg for review and suggestion for improvement

2011-02-10 Florian Haas
    Doc: interpret a double linefeed as a paragraph break
    Previous patch from Holger created slightly ugly man page output when RA longdescs had their own "fill-paragraph" line breaks inserted by editors. Rather than splitting paragraphs on single linefeeds, do so on double linefeeds. This is the same convention that AsciiDoc uses.
    --HG-- extra : rebase_source : d7b97d995e2b44cdd4315c1853a416ecb0e9a245

2011-02-10 Holger Teutsch
    Doc: ra2refentry.xsl: Improve legibility of generated man pages
    Preserve line breaks in "Description" section.
    --HG-- extra : rebase_source : 4b145a22be41486b6950b3144289d13833c801d1

2011-02-11 Dejan Muhamedagic
    Add tag agents-1.0.4-rc for changeset 526023023aec
    build: update ChangeLog and set release to 1.0.4rc

2011-02-11 Florian Haas
    Low: conntrackd: fix up for new lib directory
    Low: conntrackd: honor $PATH in "binary" default

2011-02-11 Dominik Klein
    High: conntrackd: extensive cleanup after mailing list review

2011-02-07 Marek 'marx' Grac
    resource-agents: Remove netmask from IP address when creating list of them
    IP addresses were copied directly to configuration of various resource agents.
    Resolves: rhbz#667217

    resource-agents: Apache resource with spaces in name fails to start
    Resolves: rhbz#667222

2011-02-03 Fabio M. Di Nitto
    fs-lib: fix do_monitor device mapping
    do_monitor needs to expand _device key to the real device before performing checks
    Resolves: rhbz#669832

2011-02-01 Andrew Beekhof
    Merge OCF agents from the Red Hat and Linux-HA projects

2011-02-01 Lon Hohberger
    resource-agents: Use literal quotes for tr calls
    This affects SAPInstance / SAPDatabase agents.
    Resolves: rhbz#639252
    Reviewed-by: Ryan O'Hara

2011-01-28 Marek 'marx' Grac
    resource-agents: Add option disable_rdisc to ip.sh
    rdisc is called by ip.sh, which causes static routes to be updated by dynamic ones. This option adds the possibility to disable this feature.
    Resolves: rhbz#621538

2011-01-25 Lon Hohberger
    resource-agents: Improve LD_LIBRARY_PATH handling by SAP*
    This is a backport from the Heartbeat resource agents repository.
    Author: Dejan Muhamedagic (dejan at hello-penguin com)
    http://hg.linux-ha.org/agents/rev/2773e5850003

2011-01-24 Alexander Krauth
    Low: SAPInstance,SAPDatabase: Allow blanks in path of userexit script

2011-01-26 Andres Rodriguez
    Low: proftpd,sfex: fix spelling errors in meta-data

2011-01-26 Brett Delle Grazie
    Medium: tomcat: Use Tomcat stop TIMEOUT -force to improve stop
    The tomcat stop script can be told to forcefully terminate tomcat if it doesn't shut down nicely within a specified period. Using this reduces the stop case to almost a simple 'call tomcat stop script in blocking mode'. The timeout is set to one second shorter than the stop operation timeout. The tomcat stop script checks for and uses the PID file.
    ---
    heartbeat/tomcat | 59 +++++++++++------------------------------------------
    1 files changed, 13 insertions(+), 46 deletions(-)

2011-01-22 Holger Teutsch
    Low: Dummy: make method reload work (carry forward parameter "fake" from pacemaker/Dummy)
    The resource needs a non-unique parameter for reload to be triggered.

2011-01-19 Holger Teutsch
    Low: Dummy: migrate_from/to: correct OCF_RESKEY_CRM_meta_migrate_xxx variable names

2011-01-18 Brett Delle Grazie
    Medium: tomcat: Add CATALINA_BASE parameter, defaults to CATALINA_HOME, permits multiple tomcat instances
    By exposing the CATALINA_BASE parameter it is possible to have multiple instances of Tomcat using the same binaries but with each instance having its own configuration, applications etc. Existing behaviour is preserved as CATALINA_BASE defaults to CATALINA_HOME which is the default.
    ---
    heartbeat/tomcat | 14 +++++++++++++-
    1 files changed, 13 insertions(+), 1 deletions(-)

    Low: tomcat: remove eval of empty variable - does nothing.
    Looks like a left-over from a possible 'execute this on stop' hook. The variable is not used anywhere else in the file, therefore removing it.
    ---
    heartbeat/tomcat | 1 -
    1 files changed, 0 insertions(+), 1 deletions(-)

    Low: tomcat: Ensure name of tomcat resource is only used on start operation and expose JAVA_OPTS variable for use
    A specific string (the 'name') of the tomcat resource agent is added as a define to the JVM. This is used with a grep against the process table in the monitor operation to determine if the process is still running. This patch ensures the string is only added to the process on the start operation. This patch also exposes the JAVA_OPTS variable for use in resource definitions.
    ---
    heartbeat/tomcat | 19 ++++++++++++++-----
    1 files changed, 14 insertions(+), 5 deletions(-)

    Low: tomcat: Fix to ensure default OCF_RESKEY_xx values are observed
    Use the internal name of the OCF_RESKEY_xx variables throughout ensuring that any defaults set at the beginning are observed.
    ---
    heartbeat/tomcat | 16 ++++++++--------
    1 files changed, 8 insertions(+), 8 deletions(-)

    Low: tomcat: Use here-documents to simplify start/stop operations
    ---
    heartbeat/tomcat | 30 +++++++++++++++---------------
    1 files changed, 15 insertions(+), 15 deletions(-)

2011-01-18 Dejan Muhamedagic
    Medium: Xen: implement stop of a migrating domain (bnc#656227)
    The domain name of a migrating domain changes from "VM" to "migrating-VM". In case a stop action is ordered in this situation, VM won't be found and hence considered already stopped. Now the RA also checks whether there is a "migrating-VM" and stops that.

2011-01-18 Lars Marowsky-Bree
    Build: Add GPLv3 license file (tickle_tcp requires this)

2011-01-14 Alexander Krauth
    Low: SAPDatabase: remove unnecessary usage of eval to start processes
    Low: SAPInstance: remove unnecessary usage of eval to start processes
    Medium: SAPDatabase: Avoid continuous output to syslog in monitor with SAP 7.20 and J2EE_ONLY=1

2011-01-17 Dejan Muhamedagic
    Dev: iscsi: portal is a global variable, remove the local statement

2011-01-14 Dejan Muhamedagic
    Low: jboss: remove extra ;; in the main case
    Low: Tools: ocf_tester: set and export some common meta variables (lf#2524)
    Low: iscsi: iscsiadm errors on probe are interpreted as OCF_NOT_RUNNING

2011-01-11 Alexander Krauth
    Medium: SAPDatabase: start listener only if database processes are found

2010-12-30 Alexander Krauth
    High: SAPInstance: Fixed monitor_clone function to ensure enqueue failover, in case of process (not host) failure
    RAs in versions <= 2.01 used a Heartbeat 2.0 specific feature to distinguish whether they are running in master or slave mode. This no longer works with Pacemaker. Since RA version 2.02 (not in official release) the monitor_clone function is damaged for the case of a local failure of the Standalone Enqueue process. This patch follows the requirement that the RA must know by itself whether it is running in master or slave mode.
    It also ensures that the slave (Enqueue Replication Server) always gets promoted if the master (Standalone Enqueue Server) fails.

2010-12-29 Alexander Krauth
    High: SAPInstance: New parameter: SHUTDOWN_METHOD
    High: SAPInstance: Improved inline documentation
    Low: SAPInstance: Return errors for promote and demote action, if not called in a clone environment (ocf-tester)
    Low: SAPInstance: Eliminate hostname command call for default parameter of sapcontrol
    By default, sapcontrol executes on the host where it is called, so the call of the hostname command can be omitted.
    Low: SAPInstance: Make more use of ocf-shellfuncs where possible
    High: SAPInstance: Fix some returncodes in case of probe and monitor actions
    High: SAPDatabase: Adapt process search pattern for DB/2 9.5
    High: SAPDatabase: Changed Oracle recovery method from "recover automatic database" to "end backup"
    High: SAPDatabase: Fixed wrong scope of rc variable in service_start/stop functions

2010-12-20 Alexander Krauth
    High: SAPDatabase: prevent premature expansion of [:upper:] and [:lower:] when producing sidadm/orasid/db2sid uids
    Not quoting the parameters of the tr command sometimes led to shell replacements, if certain files exist.

    High: SAPInstance: Moved testing of SAP profile directory and START profile to a later stage (only when needed), for more robustness
    Very often the SAP profile directory is located on NFS. Access to the SAP profile directory is not always required: it is not needed during a monitor action if sapstartsrv is up and responding. In that situation a stale NFS mount will no longer cause the monitor action to hang.
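
    Note on the tr quoting fix in the 2010-12-20 SAPDatabase entry above: the
    effect can be reproduced with a small sketch. This is not the agent's code;
    the scratch directory, the file named "l" and the SID "PRD" are made up for
    the demonstration. An unquoted [:lower:] is a shell glob that matches any
    one of the characters ":lower", so a matching file in the current directory
    replaces it before tr runs.

```sh
#!/bin/sh
# Sketch: why tr character classes must be quoted.
# The scratch directory, the file "l" and the SID "PRD" are hypothetical.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
: > l    # a file whose one-character name matches the glob [:lower:]

# Unquoted, the shell treats [:lower:] as a bracket glob and substitutes
# the matching file name before the command ever sees the argument.
expanded=$(echo [:lower:])

# Quoted, the character class reaches the command untouched.
literal=$(echo '[:lower:]')

# Safe form: quote both classes when deriving the <sid>adm user name.
sid="PRD"
sidadm="$(echo "$sid" | tr '[:upper:]' '[:lower:]')adm"

cd / && rm -rf "$demo_dir"
echo "unquoted: $expanded  quoted: $literal  user: $sidadm"
```

    Whether the unquoted form misbehaves depends entirely on which files happen
    to exist in the working directory, which is why the failure was sporadic.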

    High: SAPInstance: prevent premature expansion of [:upper:] [:lower:] when producing sidadm uid

2010-12-21 Dejan Muhamedagic
    Low: portblock: update exit codes on bad configuration
    Low: nfsserver: return ERR_CONFIGURED on bad configuration
    Low: MailTo: return ERR_GENERIC on bad configuration
    Low: Tools: ocft: update mysql configuration

2010-12-17 Dejan Muhamedagic
    Low: mysql: set default for shutdown timeout in case meta_timeout is not set

2010-12-14 Alan Robertson
    High: nginx: new RA

2010-12-13 Dejan Muhamedagic
    Low: Tools: findif: exit on error also when there was no error message set
    Medium: IPsrcaddr: exit with the right code when not properly configured
    Medium: IPaddr2: exit with the right code when not properly configured
    Medium: Tools: findif: differentiate between different error conditions

2010-12-07 Florian Haas
    Low: iSCSITarget: emphasize that incoming_username is unique (Novell 631886)
    --HG-- extra : rebase_source : 9733e31b9863c8e39769d13f77af8af6d8826b02

2010-12-06 Lon Hohberger
    resource-agents: Fix migrateuriopt setting
    When a user was specifically setting migration_uri (for example, to get around ssh banners causing migration to fail), vm.sh was leaving the migrateuriopt variable unset when using QEMU/KVM. This caused the printf() during command line generation to be incorrect.
    This means the generated command line looked like this:
        virsh migrate --live vm1 \
          qemu+ssh://node1.example.com/system?command=/bin/quiet_ssh.sh node1.example.com
    Instead of:
        virsh migrate --live vm1 \
          qemu+ssh://node1.example.com/system?command=/bin/quiet_ssh.sh tcp:node1.example.com
                                                                        ^^^^
    Resolves: rhbz#659477

2010-11-18 Florian Haas
    Low: Dummy: fix typo in default
    --HG-- extra : rebase_source : b6714441d150b7f0917ae26ff254325c6da19917

2010-12-07 Dejan Muhamedagic
    Medium: Filesystem: add fast_stop parameter (lf#2402)

2010-12-07 Xinwei Hu
    Medium: LVM: add partial_activation parameter (lf#2490)

2010-12-01 NAKAHIRA Kazutomo
    Medium: sfex: add the sfex_stat command

2010-12-02 NAKAHIRA Kazutomo
    build: install jboss

2010-11-30 NAKAHIRA Kazutomo
    Medium: sfex: output log messages also to stderr in sfex_init

2010-11-30 Tim Serong
    High: CTDB: Don't manage Samba and Winbind by default
    * Add many more optional parameters:
      - ctdb_manages_samba (default: no)
      - ctdb_manages_winbind (default: no)
      - ctdb_service_{smb,nmb,winbind}
      - ctdb_samba_skip_share_check
      - ctdb_monitor_free_memory
      - ctdb_start_as_disabled
      - smb_passdb_backend
      - smb_idmap_backend
    * Change backup location of CTDB sysconfig, don't auto-restore
    * Automatically disable most CTDB event scripts
    * Don't auto-generate Samba config if CTDB not managing Samba

2010-11-30 Sebastiaan Hoogeveen
    medium: ldirectord: precedence error with perl v5.8.8
    There seems to be a minor bug in the ldirectord snapshots starting August 26th, related to a precedence ambiguity in the IPv6 patches submitted by Sohgo Takeuchi. My perl 5.8.8 installation would yield a number of syntax errors including:
        syntax error at ldirectord.2010-10-14.6efae155209e line 1533, near "$1 =~ /\"?("
        (Might be a runaway multi-line ?? string starting on line 1371)
    It appears that the ?: operator which is used on lines 1371 and 1539 has a precedence problem. The code compiles fine when replacing occurrences in these lines of:
        $af == AF_INET ? "" : "6"
    with:
        ($af == AF_INET) ? "" : "6"
    I assume the problem is related to a different definition of AF_INET in different versions of the perl Socket-library, possibly causing the problem not to be reproducible for all versions of perl and Socket.pm. If I should report this minor issue any other way, please let me know.

2010-11-22 Dejan Muhamedagic
    build: no tickle where no struct iphdr

2010-11-15 NAKAHIRA Kazutomo
    Low: sfex: revise timeout values

2010-11-09 Holger Teutsch
    Medium: db2: add multi partition support

2010-11-09 Fabio M. Di Nitto
    build: allow autoreconf from tarball release

2010-11-09 Simon Horman
    Medium: ldirectord: Restore Provides and Conflicts heartbeat-ldirectord (LF2472)
    This brings the code in this repository into line with the directives suggested by Masashi Yamaguchi[1] and tested by David Warden[2].
    [1] http://www.mail-archive.com/linux-ha@lists.linux-ha.org/msg17174.html
    [2] http://www.mail-archive.com/linux-ha@lists.linux-ha.org/msg17235.html0

2010-11-08 Fabio M. Di Nitto
    build: enable publishing
    build: fix script permission
    build: add automatic versioning and release script

2010-11-05 Fabio M. Di Nitto
    overall cleanup and make ready for release

2010-11-05 Holger Teutsch
    Medium: db2: guard against a hanging db2stop by spawning this into the background. Use db2_kill after grace period.
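
    The pattern named in the 2010-11-05 db2 entry above (run the stop command
    in the background, escalate after a grace period) can be sketched in plain
    sh as follows. This is an illustration only, not the db2 agent's code:
    stop_with_grace is a hypothetical helper, and kill -9 merely stands in for
    whatever forced cleanup (db2_kill in the entry) the real agent performs.

```sh
#!/bin/sh
# stop_with_grace GRACE_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the background and polls once per
# second. If COMMAND is still running after GRACE_SECONDS it is killed
# (stand-in for a forced cleanup such as db2_kill) and 1 is returned.
# Otherwise the exit status of COMMAND is returned.
# Note: plain sh has no "local", so the variables below are global.
stop_with_grace() {
    swg_grace=$1
    shift
    "$@" &
    swg_pid=$!
    swg_waited=0
    while kill -0 "$swg_pid" 2>/dev/null; do
        if [ "$swg_waited" -ge "$swg_grace" ]; then
            kill -9 "$swg_pid" 2>/dev/null
            wait "$swg_pid" 2>/dev/null
            return 1
        fi
        sleep 1
        swg_waited=$((swg_waited + 1))
    done
    wait "$swg_pid"
}

# Usage: "true" stands in for a stop command that finishes promptly.
if stop_with_grace 2 true; then
    echo "stopped cleanly"
else
    echo "had to force the stop"
fi
```

    Polling with kill -0 keeps the sketch portable; the grace period should be
    chosen shorter than the stop operation timeout so that the forced cleanup
    still runs before the cluster manager gives up.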

2010-11-08 Serge Dubrouski
    Low: pgsql: don't use fuser in status

2010-11-04 Holger Teutsch
    Low: db2: don't let the start method fail when a client connects faster to the db than we activate the db
    SQL1494W Activate database is successful, however, there is already a connection to the database
    is success and not failure

2010-11-02 RaSca
    Medium: anything: add the workdir parameter
    --HG-- extra : rebase_source : 1a0e362ee90063d03bbcf9cbbd4fdacb034bfc17

2010-10-27 Lon Hohberger
    resource-agents: Add multi-instance Oracle database agents
    Resolves: rhbz#629208

2010-10-27 Dominik Klein
    High: new RA: conntrackd

2011-02-09 Dejan Muhamedagic
    Dev: add OCF_ROOT/lib/heartbeat directory
    - ocf-shellfuncs and others are moved to OCF_ROOT/lib/heartbeat
    - there are still compatibility links in OCF_ROOT/resource.d/heartbeat
    - other code to be sourced by resource agents can live now in the new directory
    - tested with heartbeat RA as well as other providers which are still sourcing the old .ocf-shellfuncs
    - old compatibility scripts /usr/lib/heartbeat/ocf-shellfuncs and ocf-returncodes removed, they've been there since 2007 (there'll be appropriate text in the release announcement and the changelog)
    - hg log --follow ocf-shellfuncs.in won't work, use hg log .ocf-shellfuncs.in to get the history :(
    --HG--
    rename : heartbeat/.ocf-binaries.in => heartbeat/ocf-binaries.in
    rename : heartbeat/.ocf-directories.in => heartbeat/ocf-directories.in
    rename : heartbeat/.ocf-returncodes => heartbeat/ocf-returncodes

2011-02-02 Lars Marowsky-Bree
    Low: IPaddr2: Fix reference to Infiniband arping binary (bnc#668447)

2010-10-21 Florian Haas
    Low: ManageVE: add proper license information, bump version number, update documentation
    --HG-- extra : rebase_source : 4ab6a9e29656638c038c0a57d077dcaf2df4b89e

2010-10-19 Florian Haas
    Low: ManageVE: clean up stop action
    Invoke status before stop. Use ocf_run.
    --HG-- extra : rebase_source : 2bd46fe66e46d1a5dbfd753f058e455082579a34

    Low: ManageVE: clean up start action
    Invoke status before start, this removes the need for checking for the 32 exit code (VE already running) on start. Use ocf_run.
    --HG-- extra : rebase_source : 80c0a62982fc18710a6c100e2da1994471237526

2010-10-18 Lon Hohberger
    Remove antiquated, unused XSLT scripts

2010-10-18 Florian Haas
    Medium: ManageVE: add migration capability
    OpenVZ does not support live migration, but it supports checkpoint/restore. Which is close enough.
    --HG-- extra : rebase_source : a915cfa8ee9ddb9ee28f03c30978e6526779236d

2010-10-19 Florian Haas
    Low: ManageVE: exit, don't return, on error
    --HG-- extra : rebase_source : 4ffbe87138b22c51dbb39fe8bb35310b27c96ccc

2010-10-27 Florian Haas
    Medium: exportfs: correctly test for lease time file
    Thanks to Lars Ellenberg for spotting this.

2010-10-16 Lon Hohberger
    rgmanager: Add XSLT scripts for extracting RA metadata
    resource-agents: Clean up recursion and documentation
    Resolves: rhbz#606470
    resource-agents: Add missing resource docs
    Resolves: rhbz#606470

2010-10-16 Marek 'marx' Grac
    Revert "config: Add missing resource docs to cluster.rng"
    This reverts commit 64c6eca0cecb66050ad614236535ee9ca1fa7eff.

2010-10-16 Lon Hohberger
    rgmanager: Add failure tolerances to resources.rng
    resource-agents: Add missing resource docs
    Resolves: rhbz#606470

2010-10-16 Marek 'marx' Grac
    config: Update config schema
    Only the part for resource-agents was included.
    Resolves: bz583789

2010-10-15 Florian Haas
    High: exportfs: cleanup
    Give the exportfs RA a hearty makeover:
    * Do away with the odd background "backup" job.
    * Stop copying a filtered rmtab into the root of the export.
    * Don't specifically look into the etab to figure out what is currently exported, just ask exportfs and parse its output.
    * Actually check whether the export already (or still) exists during start and stop.

    Low: exportfs: run exportfs with -v option
    Invoke the exportfs binary with the -v (verbose) option, so ocf_run can log something useful.

    Medium: exportfs: ensure graceful failover with NFSv4 clients

2010-10-07 Marek 'marx' Grac
    rgmanager: Merge resource schema generation bits
    rgmanager: Support convalesce w/ central_processing
    rgmanager: Initial commit of central proc + migration support
    This adds preliminary migration support of virtual machines when using central processing.

2010-10-07 Lon Hohberger
    resource-agents: Stop using '-' as 1st char of log messages
    Resolves: rhbz#633856

2010-10-07 Marek 'marx' Grac
    rgmanager: Fix event generation with central_processing
    This patch fixes event generation and processing when a node dies. Effectively, what was happening is that when a node failed and was fenced, no events for the dead services on that host were generated. This led to dependent services not restarting correctly in many cases.
    Resolves: rhbz#523999

2010-10-07 Carlos Eduardo Maiolino
    resource-agents: Remove nfs service temp directories
    Resolves: rhbz#595455

2010-10-07 Marek 'marx' Grac
    rgmanager: Fix relocation & migration errors
    If you relocate a service but in the end, it ends up on the same node, the error message was "Failure". While technically correct because the relocation failed, there really is no reason to not have an error to indicate the condition that the service is still running. Furthermore, during migration, if a migration had a non-critical failure causing the migration to fail but leaving a virtual machine running on the original owner, there was no method to detect this particular condition.

2010-10-01 Marek 'marx' Grac
    resource-agents: fix utility to obtain data from ccs_tool
    Resolves: rhbz#631943

2010-09-30 Marek 'marx' Grac
    resource-agents: tomcat-5 can't start properly
    Error was introduced when switching from sudo to su
    Resolves: rhbz#591003

2010-09-27 Florian Haas
    Medium: exportfs: add unlock_on_stop parameter
    When the resource stops, it's a good idea to relinquish all locks which the NFS server holds on the exported file system. On Linux, the NFS server can be forced to do this by echoing the affected path into the /proc/fs/nfsd/unlock_filesystem virtual file. Leave this parameter disabled by default for the following reasons:
    - POLS. Don't introduce a new default since people are already using this RA.
    - The relevant virtual file may not be present on legacy kernels.
    - The resource may be running on an OS other than Linux.

2010-10-14 Simon Horman
    ldirectord: low: Receive string must always start with "/"
    There are two locations where the receive string can be set. One corresponding with the real=... configuration directive and one corresponding with the request=... directive. Unfortunately the parsing of these was slightly inconsistent, which led to the truncation of the first letter of the request string when the real=... syntax is used in conjunction with the simpletcp check.
    Reported-by: Niles Ingalls

2010-10-14 Sohgo Takeuchi
    ldirectord: medium: Shutdown write-side of client connection after writing has finished
    Tested-by: Niles Ingalls

2010-10-11 renayama19661014@ybb.ne.jp
    Low: pgsql: reduce severity in monitor on when probing
    Dev: shellfuncs: allow ocf_run log at different severities

2010-09-27 Florian Haas
    exportfs: change fsid parameter to string and expand parameter description
    fsid need not be integer, but can also be a UUID or the string "root". Add extended parameter description.
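
    To illustrate the three fsid forms named in the exportfs entry above
    (integer, UUID, or the string "root"), here is a minimal validation sketch.
    fsid_is_valid is a hypothetical helper written for this note; the exportfs
    agent itself is not known to perform this check, and the sample UUID in the
    usage line is made up.

```sh
#!/bin/sh
# fsid_is_valid VALUE
# Hypothetical helper: succeeds if VALUE is the literal string "root",
# a non-negative integer, or a UUID in 8-4-4-4-12 hexadecimal form.
fsid_is_valid() {
    case "$1" in
        root)
            return 0 ;;
        ''|*[!0-9]*)
            ;;              # empty or not purely digits: try the UUID check
        *)
            return 0 ;;     # digits only
    esac
    printf '%s\n' "$1" | grep -Eq \
        '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Usage with one value of each kind, plus one that should be rejected.
for candidate in root 42 0a1b2c3d-4e5f-6789-abcd-ef0123456789 not-an-fsid; do
    if fsid_is_valid "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: rejected"
    fi
done
```

    Treating the parameter as a string, as the entry describes, is what makes
    the non-numeric forms expressible in the first place.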

2010-09-30 Florian Haas
    Medium: IPaddr2: optionally flush kernel routing table on interface stop
    This is an updated version which uses the generic "ip route flush cache" command instead of relying on the existence of a specific sysctl.

2010-10-06 Serge Dubrouski
    Medium: pgsql: add optional username, password, and sqlcode parameters for monitor

2010-10-06 Dejan Muhamedagic
    Low: EvmsSCC: rely on the exit code of evms_activate in start notify

2010-09-29 Tim Serong
    Medium: CTDB: Remove hard-coded timeout on start op

2010-09-27 Dejan Muhamedagic
    Medium: Filesystem: allow cloning of some filesystems as read-only (thanks to Matthew Richardson) (lf#2440)

2010-09-24 Lars Ellenberg
    Dev: pgsql: use $* instead of $@ for su

2010-09-22 Serge Dubrouski
    Low: pgsql: cd to pgdata before running commands

2010-09-22 renayama19661014@ybb.ne.jp
    Low: pgsql: issue just a warning if the config file is missing during a probe

2010-09-22 Serge Dubrouski
    Medium: ocf-shellfuncs: allow ocf_run to return the actual exit code

2010-09-21 Simon Horman
    low: ldirectord: fix spelling errors in ldirectord.cf
    Medium: ldirectord: example configuration for a submission virtual service
    medium: ldirectord: add implicit support for submission RFC4409
    This is really just an alias for smtp on port 587

2010-09-21 Lars Marowsky-Bree
    Low: MailTo: email address might be an alias for which no local users exists
    --HG-- extra : rebase_source : 06bacebf655cca896c96620c56ece6d24d299af9

2010-09-01 Lars Marowsky-Bree
    Xen: Simplify xen-destroy detection.
    --HG-- extra : rebase_source : dc803a91e68773c0e4f71a7e36819db4d32648d5

2010-09-20 Sohgo Takeuchi
    low: ldirectord.cf: added an example config of IPv6

2010-09-19 Dejan Muhamedagic
    Low: apache: fix a few typos in the meta-data
    Low: SAPDatabase,SAPInstance: improve LD_LIBRARY_PATH processing (bnc#640026)

2010-09-16 Holger Teutsch
    High: db2: Replace call to db2_local_ps with db2nps (Thanks to Evgeny Nifontov)
    db2nodes.cfg typically contains a cluster service address and db2_local_ps never works in this case.

2010-09-09 Dejan Muhamedagic
    Low: Xen: set the proper type for shutdown_timeout in the meta-data and check its value (bnc#637525)
    Medium: Xen: check the allow_mem_management boolean properly (bnc#637525)

2010-08-31 Edward Z. Yang
    High: ldirectord: http: connect to server instead of protocol Debian#594958
    Our LVS setup uses ldirectord's real configuration option to check if servers were alive.
        real=18.181.0.53 gate "heartbeat/services", "1"
    Upon upgrading to the latest version of ldirectord on lenny-backports, we noticed that this code was not working (and we correspondingly did not have a primary). Debugging the ldirectord process revealed that it was improperly making an HTTP request with an invalid host field of 'http'. Checking the source code, we found the following bug in the regex for check_http:
    See http://bugs.debian.org/#594958

2010-08-30 Lars Marowsky-Bree
    Low: Xen: Always run destroy in stop sequence.
    Low: Xen: use xen-destroy on stop, if available; and force stop harder.
    Xen: small cleanup to xlist invocation
    Low: Xen: use xlist query command for status check if available (bnc#628735)
    Low: Xen: Allow node configurable attribute to specify which IP to use for live migration (bnc#628735)

2010-08-27 Dejan Muhamedagic
    Low: IPaddr: add -n to route in stop to prevent name resolution (thanks to Alain Moulle)

2010-08-26 Sohgo Takeuchi
    Medium: ldirectord: Enhance IPv6 support
    - Added new directives: virtual6, real6, fallback6.
      These are used to specify IPv6 addresses or hostnames. An IPv6 address can no longer be specified in a "virtual" or "real" line (this breaks backwards compatibility).
    - Supported following checktypes and services.
      checktype: connect, external, external-perl, negotiate, off, on, checktimeoutN
      service: dns, nntp, none, simpletcp, sip
    - firewall-mark + IPv6 works.

2010-08-25 japc
    Low: IPaddr2: add missing 5 to the validation string for the multicast MAC

2010-08-24 Dejan Muhamedagic
    Low: ocf-shellfuncs: variables local to functions should have local scope
    It seems that somehow there wasn't any variable name clash so far, but if it ever happens it would be difficult to trace.

    Low: apache: stop apache if monitor fails in the start action

2010-08-23 Serge Dubrouski
    Low: pgsql: postpone getting the socketdir parameter which depends on the configuration (suppresses error in the meta-data action)
    Low: pgsql: suppress irrelevant output in the monitor operation

2010-08-19 Marek Marczykowski
    Medium: mysql: if appropriate, delete slave config on start
    A node might be returning to the cluster after a crash and be configured as a replication slave. When it returns and there is no master, we should clear its slave configuration (i.e. change its master to none).
    --HG-- extra : rebase_source : 95b1aef8642699f000bd7e4f1867c3d8b013a8f8

    Medium: mysql: if appropriate, connect to master on start
    If a new node joins the cluster, a master instance may already be present. When that is the case, we should start replicating immediately. Modify start operation so it starts MySQL replication if a master_uname is available.
    --HG-- extra : rebase_source : 05b9e5896ea8725fb25659413eb356eee87a1e2f

2010-08-20 Dejan Muhamedagic
    Low: ocf-shellfuncs: add option -q to ocf_run to suppress verbose logging
    --HG-- extra : rebase_source : d566dd91fbd27b53a38435b9ec953c9bf6982624

2010-08-20 Simon Horman
    Medium: ldirectord: Remove Provides and Conflicts heartbeat-ldirectord (LF2472)
    It seems that Obsoletes heartbeat-ldirectord is sufficient to upgrade heartbeat-ldirectord to ldirectord. As it stands the package can't be installed by yum.

2010-08-18 Dejan Muhamedagic
    Low: IPaddr2: improve error handling when findif fails

2010-08-17 Dejan Muhamedagic
    Low: IPaddr2: exit early and with the right code if the ip parameter is not set

2010-08-12 Lars Marowsky-Bree
    Low: Filesystem: Clarify metadata and improve non-clone warning

2010-08-09 Lars Marowsky-Bree
    Low: IPv6addr: Memory alignment fix for IA64
    And yes, I am not happy either; but even with memalign() memory, gcc still complains, so the void * intermediate cast is still necessary. Just the cast w/o the memalign() allocation would crash on IA64. I hope that architecture dies a quick death.
    --HG-- extra : rebase_source : 0fb3f39970bdc120e7b5fba1dcdb243c3415e481

2010-08-09 Sohgo Takeuchi
    Medium: ldirectord: port number mismatch of imaps and pops

2010-08-06 Dejan Muhamedagic
    Low: IPv6addr: interface index in /proc/net/if_inet6 may be longer than 2 chars (thanks to Stefan Sakalik) (lf#2462)
    --HG-- extra : rebase_source : 271f34c3be2c1e8bcc568220f738ca6cb7c74d3b

2010-08-05 Sohgo Takeuchi
    medium: ldirectord: Assign a port to fallback servers specified in a virtual server section if needed
    I found a bug in ldirectord and attach a patch to fix this problem to this E-mail. The bug is that if a port is omitted in a "fallback" in a virtual section, the entry is never seen in the virtual server table even if all real servers are down.
        virtual=10.10.100.1:daytime
            real=10.10.100.2:daytime gate
            fallback=127.0.0.1
    The debug output looks like this:
DEBUG2: Running system(/sbin/ipvsadm -a -t 10.10.100.1:13 -r 127.0.0.1: -g -w 1) Running system(/sbin/ipvsadm -a -t 10.10.100.1:13 -r 127.0.0.1: -g -w 1) illegal real server address[:port] specified DEBUG2: system(/sbin/ipvsadm -a -t 10.10.100.1:13 -r 127.0.0.1: -g -w 1) failed: system(/sbin/ipvsadm -a -t 10.10.100.1:13 -r 127.0.0.1: -g -w 1) failed: The problem in the source code is that when the port of a "fallback" is omitted, it is derived from the port specified in the "virtual" service (this is how the parse_fallback function behaves), but the port is used before it is defined. I tested ldirectord on Ubuntu 10.04 with perl 5.10.1. 2010-08-04 Dejan Muhamedagic Low: ocf-tester: show output from the agent in case of error 2010-08-03 Dejan Muhamedagic Medium: IPaddr2: unique_clone_address should work without CIP (lf#2442) IPaddr2 shouldn't add iptables rules when unique_clone_address is set. See the referenced bugzilla for more details. 2010-08-02 Jan Sembera Low: ocf-shellfuncs: handle properly syslog facility set to none 2010-07-26 Keisuke MORI IPv6addr: remove libnet dependency 2010-07-29 Sohgo Takeuchi Medium: ldirectord: allow underscore in service name If an underscore is contained in the service name of a hostname:servicename pair (e.g. real=realserver.example.com:pipe_server), it leads to a configuration error. 2010-07-26 Tim Serong Medium: CTDB: Deprecate (and make optional) smb_private_dir param (bnc#623788) 2010-07-26 Simon Horman Merged with http://hg.linux-ha.org/agents 2010-07-26 Tim Pretlove Medium: ldirectord: Oracle compatibility Allow checking of Oracle databases to work correctly.
Tested-by: Geoff Harrison 2010-07-20 Hirotaka Igarashi resource-agents: RA for psql does not work correctly with netmask Resolves: rhbz#614457 2010-07-20 Lon Hohberger resource-agents: Allow other values for "yes" Resolves: rhbz#614421 2010-07-20 Marek 'marx' Grac resource-agents: change build system to include tomcat6 RA Resolves: rhbz#593721 resource-agents: new agent for tomcat 6 Resolves: rhbz#591003 resource-agents: Use SIGQUIT if SIGTERM was not fast enough There are applications (like psql) that are not closed with SIGTERM if there are open connections. New function stop_generic_sigkill() was introduced to handle such cases and disable the service correctly. Based on patch proposed in bz by Shane Bradley Resolves: rhbz#612165 2010-07-20 Lon Hohberger resource-agents: Add resource type to logging resource-agents: Add NFSv4 agent to installation Resolves: rhbz#595547 2010-07-20 Masahiro Matsuya resource-agents: Fix migration mapping behavior w/ virsh Consider a two node cluster. The hostnames of the nodes are 'sk010001' and 'sk010002'. Each node has two bonded network interfaces for public and private communications. The hostname matches the hostname of the IP address on the public network. Node1: sk010001 bond0 (for public network) : 172.22.51.1 sk010001 bond2 (for private network): 172.22.48.131 sk010001-hb Node2: sk010002 bond0 (for public network) : 172.22.51.2 sk010002 bond2 (for private network): 172.22.48.132 sk010002-hb In cluster.conf, a migration mapping is used to specify that the private interfaces should be used for migration traffic. Unfortunately, when doing a live migration, while the traffic should use the -hb interfaces, bond0 is used. This is because the vm.sh agent uses the following command for live migration from sk010001 to sk010002: virsh migrate --live su21k003 \ qemu+ssh://sk010002-hb/system This is not enough to ensure the guest goes over the private interface. The --migrateuri option of 'virsh migrate' is needed for it.
The following command should be executed instead: virsh migrate --live su21k003 \ qemu+ssh://sk010002-hb/system tcp:sk010002-hb Resolves: rhbz#596918 2010-07-20 Lon Hohberger resource-agents: Add NFSv4 support This patch allows the nfsserver agent to export NFSv4 and map recovery directories to places on shared file systems in order to allow v4 recovery to take place when the NFS daemons are started. Resolves: rhbz#595547 2010-07-16 Lars Ellenberg Low: BuildRequires help2man; fix %files Low: agent meta data must be valid xml, escape as needed 2010-07-15 Brett Delle Grazie Low: tomcat: run the stop command in background Low: tomcat: prevent spurious 'n not found' log messages Low: tomcat: update the meta-data 2010-07-15 Simon Horman low: remove stray .cvsignore file 2010-07-09 Marek Marczykowski Medium: mysql: use replication credentials to test the replication status 2010-07-09 Dejan Muhamedagic Medium: IPsrcaddr: add the cidr_netmask parameter 2010-07-07 Dejan Muhamedagic Low: mysql: replace the deprecated -O option (bnc#620275) --HG-- extra : rebase_source : 2a439e389b30abddc9d76f99b2359194972185b5 2010-07-06 Lars Marowsky-Bree Medium: Raid1: Improve re-adding device logic (bnc#619121) 2010-07-06 Lon Hohberger resource-agents: fix incorrect link resolution in fs-lib Fs-lib.sh was not resolving symbolic links prior to checking [ -b ], meaning that we would end up with a potential mismatch if what is in /proc/mounts did not match what was in cluster.conf, causing a service failure. Resolves: rhbz#609579 resource-agents: Make vm.sh use stop/start timeouts Startup timeouts are handled by calling the status_program (if specified) every few seconds until either the timeout is reached or the status program returns a successful result. Because the combined VM boot + application time was the basis for the 5 minute check interval for depth=10 (where the status_program is called), it has been reduced to 1 minute intervals.
The startup timeout, since it may now be waiting for services within the VM, has been increased to 5 minutes from 20 seconds to match the previous status check timing functionality. This patch fixes the previous patch's inadequate handling of start timeouts. Resolves: bz606754 Revert "resource-agents: Make vm.sh use stop/start timeouts" This reverts commit c9bbf4f12402c5e38e51e21d38682712c18ab5ee. resource-agents: Make vm.sh use stop/start timeouts Startup timeouts are handled by calling the status_program (if specified) every few seconds until either the timeout is reached or the status program returns a successful result. Because the combined VM boot + application time was the basis for the 5 minute check interval for depth=10 (where the status_program is called), it has been reduced to 1 minute intervals. The startup timeout, since it may now be waiting for services within the VM, has been increased to 5 minutes from 20 seconds to match the previous status check timing functionality. Resolves: bz606754 resource-agents: Add missing resource docs Resolves: rhbz#606470 Revert "config: Add missing resource docs to cluster.rng" This reverts commit 64c6eca0cecb66050ad614236535ee9ca1fa7eff. config: Add missing resource docs to cluster.rng Resolves: rhbz#604298 resource-agents: Resolve incorrect default The incorrect default was causing VMs to restart every 5 minutes. 
Resolves: rhbz#599643 resource-agents: Add status_program attribute Resolves: bz583789 resource-agents: Clean up file system agents - Make mountpoints with spaces work universally - Use fuser -kvm in all agents to make force_unmount work - Eliminate duplicate code Resolves: rhbz#581533 rhbz#582753 rhbz#582754 rgmanager: Minor cleanups for file system agents - don't double-check if the mount point exists; once is enough - report an error if creation of the mount point fails Resolves: rhbz#581533 rgmanager: Allow spaces in fs.sh mount points Resolves: rhbz#582573 rgmanager: Kill processes correctly w/ force_unmount The killMountProcesses function was written about 10 years ago. It was designed to work with lsof or fuser, and to log messages for each process killed. This is not a bad idea. The problem is that parsing the output of either is error-prone, particularly when mountpoints are similar to other directories on the system. A far less error-prone method of cleaning up a mount point is to use 'fuser -kvm' on it. Not only is this less error-prone, it's a good bit faster at doing its job than iterating through output in a shell script. This patch makes force_unmount very reliable at killing the correct processes, but we lose the logging functionality. It is a fair trade-off because there have been several bugs in the killMountProcesses function over the years which have caused several problems. Resolves: bz555901 bz582754 Revert "resource-agents: Kill correct PIDs during force_unmount" This reverts commit c5f279928da14f72fac458bd2439b43ea95e66df. The bug did not apply to STABLE3. resource-agents: Clear vm.sh default The default behavior of vm.sh is to use_virsh whenever possible. The problem with specifying a default in the RA metadata is that it causes rgmanager to provide a value to the RA when it is called. In this case, rgmanager was providing "1" as the default, meaning that the logic which carefully determines whether to use 'xm' or 'virsh' is disabled.
This is not seen when testing the resource agent by hand; it is only seen when called in the context of rgmanager or rg_test. 2010-07-06 Shane Bradley resource-agents: Kill correct PIDs during force_unmount When stopping a service that contains a filesystem resource managed by fs.sh, the stop operation can kill a process that is not located on that mount point. There were a couple of scenarios in which a process that was not on the mount point would be killed when the service was stopped: These processes should not have been killed: $ less /tmp/media/demo1/tmp.txt $ less /tmp/test\ /media/demo1/tmp.txt These processes were, and should have been, killed: $ less /media/demo1/tmp.txt Resolves: rhbz#555901 2010-07-06 Lon Hohberger resource-agents: SAPDatabase: remove $TEMPFILE 2010-07-05 Fabio M. Di Nitto resource agents: Remove bashisms from resource scripts Patch from Guido Gunter Closes Debian Bug: #581137 2010-07-05 Lars Marowsky-Bree Low: Raid1: Support attempting to re-add mirrors on deep monitor action This requires a very recent mdadm to support the "--re-add missing" option. It is also recommended to set the "policy action=re-add" in mdadm.conf, so that mirrors are automatically re-added on assemble. 2010-07-05 Simon Horman low: unset CFLAGS in a dash-friendly way According to the dash man page, the only way to unexport a variable is to unset it. This also works in bash. For the record, the reason is that if CFLAGS is exported from the environment to configure and configure appends -Werror to the CFLAGS, then it will be re-exported by configure, which has been observed to cause trouble when configure runs a separate configure in the libltdl directory. See: Debian Bug #582874 - http://bugs.debian.org/582874 Debian Bug #582875 - http://bugs.debian.org/582875 2010-07-01 Lars Marowsky-Bree Merge local changes with agents. 2010-07-01 Lars Ellenberg Med: VirtualDomain: Fix spurious stop failures Don't fail on stop just because one virsh domstate got "no state".
Don't fail a "forced stop" of an already stopped resource. Don't timeout in stop before escalating to "forced stop". Combination of the first two bugs frequently led to "failed stop", where the stop in fact succeeded just fine. --HG-- extra : rebase_source : 5ba5044d6938b1ff70c7e2fea0a7723ca91c2e54 2010-07-01 Raoul Bhatia [IPAX] Low: Filesystem: include ext4 and ext4dev in the supported filesystems list (prior to Linux 2.6.28) 2010-07-01 Marek Marczykowski Low: mysql: a note on --skip-slave-start in the meta-data Low: mysql: more detailed logging on errors Medium: mysql: exit with proper code on monitor success when master Low: mysql: fix possible syntax error when slave lagging behind master for more than max_slave_lag Low: mysql: remove trailing spaces from variables 2010-07-01 Lars Marowsky-Bree High: Raid1: Fix graceful stop code path. 2010-06-30 Lars Marowsky-Bree Low: Raid1: Syntax clean-ups Low: Raid1: Remove horribly outdated comments and broken validate-all function Low: Raid1: mdadm --readonly is asynchronous, moving it to a better place Trying to mark the array readonly before stopping it actually makes the stop fail (since the readonly action somehow hasn't completed yet). Thus, it's safer to directly try to stop it and to attempt the readonly mode only if the stop failed. Low: Raid1: umounting the filesystem is outside Raid1's job description. 
Low: Raid1: prefer mdadm to legacy raidtools due to better reporting Med: Raid1: start needs to properly handle bad array states (bnc#618775) Low: Raid1: Whether or not the md is already mounted is irrelevant for start High: Raid1: Handle stop for failed arrays properly (bnc#618775) 2010-06-25 Serge Dubrouski Low: pgsql: remove a duplicate of start_opt from the meta-data 2010-06-25 Dejan Muhamedagic Low: pgsql: update the validation handling Low: pgsql: a few fixes for the socketdir handling 2010-06-25 Serge Dubrouski Low: pgsql: get the default value of the socketdir from the config file 2010-06-23 Serge Dubrouski Medium: pgsql: socketdir parameter to manage non-default UNIX socket directories 2010-06-17 Dejan Muhamedagic Medium: db2: fix the probe operation High: db2: support for v9.x instances (bnc#608952) 2010-06-14 NAKAHIRA Kazutomo Medium: sfex: wait in the start and stop actions until sfex_daemon starts/exits On start, exit sooner if sfex_daemon started, otherwise wait until it starts. On stop, send the TERM signal, then wait until 5 seconds remain before the timeout, then try with KILL. --HG-- extra : rebase_source : d6ca8f482a4968ae3f494729e08aac3759f94118 2010-06-14 Simon Horman Medium: ldirectord: don't exit on timeout in HTTP/HTTPS check Tested-by: Rob de Wit 2010-06-08 Florian Haas Medium: mysql: only check slave status if OCF_CHECK_LEVEL>0 check_slave requires test_user to be set. Since the previous assumption was that test_user needed to be set only if $OCF_CHECK_LEVEL was greater than zero, follow that precedent for check_slave too. 2010-06-04 Florian Haas High: mysql: check for write permissions after creating pid and socket directory Stupidly, we'd previously check permissions on the pid directory before creating it. Move the permission check after directory creation, and also check the socket directory in the same manner.
--HG-- extra : rebase_source : f4cd3b9b6b0d32e35aa10deb6b7ebc23bb91b191 2010-06-07 Haw Loeung Low: ldirectord: Don't strip pre-pended "/" from query strings when using simpletcp ldirectord is used to monitor SMTP connections and issues an SMTP "QUIT". It then checks if it receives a "221 ..." SMTP message to say it was successful. Unfortunately, ldirectord sends "/QUIT" instead of "QUIT" and this fails. 2010-06-04 Jan Kasprzak Medium: IPv6addr: allow link-local addresses in case the interface name is provided 2010-06-04 Takatoshi MATSUO Medium: Filesystem: new run_fsck parameter 2010-06-01 Florian Haas Low: drop unnecessary locale boilerplate Locale related environment variables are set from .ocf-shellfuncs, no need to duplicate. Low: iSCSITarget: IET: don't touch initiators.{allow,deny} if they do not exist If the user does not configure any target security, initiators.{allow,deny} may not even be created. Thus, when we stop the target, we should check whether they exist so we do not attempt to sed a non-existent file. Thanks to Matthew Richardson for reporting this. High: iSCSITarget: fix race for target IDs when using IET (LF2432) When multiple iSCSITargets using IET are started in parallel, they may race for target IDs. Use ocf_take_lock so competing instances wait for each other when selecting a target ID. Thanks to Matthew Richardson for reporting and testing. 2010-05-28 Lars Marowsky-Bree ocf-tester: meta-data also should never be affected by missing binaries. Medium: ocf-tester: Extend to cover initial probe (monitor_0) test. --HG-- extra : rebase_source : 407b91f9621482f83d5c744237bdc9adb88c72a6 2010-05-28 Florian Haas Low: fio: add man page 2010-05-28 Lars Marowsky-Bree Low: fio: shield check_binary with ocf_is_probe Low: fio: Move check_binary to not interfere with meta-data. 2010-05-27 Lars Marowsky-Bree Build: add fio to makefile. Low: fio: new resource agent for IO load simulation.
2010-05-27 Florian Haas Low: mysql: use ocf_is_true to check for enable_creation enable_creation is a boolean parameter, which might be passed in as 0|1, yes|no, true|false, on|off. The RA would do an equality check for 1. Use ocf_is_true instead. 2010-05-26 Thomas Bätzler Medium: exportfs: match client specifications properly for multiple exports of the same directory 2010-05-25 Thomas Bätzler Medium: exportfs: change the wrong reference to RMTAB to ETAB in monitor 2010-05-25 Andrew Beekhof Low: doc: Create a man page for ocf-tester 2010-05-25 Lars Marowsky-Bree Low: ocf-shellfuncs: correctly identify root by id only (bnc#602312) Some setups have multiple accounts with uid=0 but not all named "root". 2010-05-19 Florian Haas Low: iSCSILogicalUnit: add missing exit --HG-- extra : rebase_source : 849356ae4aa76c75d53ce40ce67e32a5fefb35fb Low: iSCSITarget, iSCSILogicalUnit: add status operation Add status as a simple alias for monitor. --HG-- extra : rebase_source : 4d5a666f00e1898166f453610e98cdff85f2a743 Low: iSCSITarget: fix indentation --HG-- extra : rebase_source : dd75510135e2ad56cc193e43e18786d8ba8b79f5 Low: iSCSITarget, iSCSILogicalUnit: drop do_cmd, replace with ocf_run --HG-- extra : rebase_source : fec03ea4ceaf8bdbc9838c4ddbcc771362456c60 Low: iSCSITarget, iSCSILogicalUnit: fix probe behavior Check for ocf_is_probe during validate, so only limited validation is performed during probes. --HG-- extra : rebase_source : 52ccbfa6e9f667e832e13dc426ff4c2c0692fada Low: iSCSITarget, iSCSILogicalUnit: exit, don't return, on error --HG-- extra : rebase_source : 8b0b6b8147a9b8607416acafe798bd21a5c5b442 2010-05-16 Florian Haas Low: anything: fix arithmetic expression (Debian #581073) Fix possible bashism. Low: SAPDatabase, SAPInstance: fix redirection (Debian #581073) Fix possible bashism. Low: xinetd: fix checkbashisms FP (Debian #581073) "type" at the start of a line causes a checkbashisms false-positive. Rewrap the line to keep it happy. 
Low: eDir88: force bash (Debian #581073) RA manages an application that is only supported on platforms where bash is the default shell. No reason to fix bashisms, simply force bash. Low: AudibleAlarm: replace "echo -ne" with printf (Debian #581073) Fix possible bashism. Low: replace "kill -SIG" with "kill -s SIG" (Debian #581073) Remove possible bashism. 2010-05-17 Serge Dubrouski Low: pgsql: add check for fuser Thanks to Jose Castro Luis for reporting the issue. 2010-05-16 Florian Haas Low: CTDB: fix iteration (Debian #581073) Fix possible bashism. Low: CTDB: fix test expression (Debian #581073) Fix possible bashism ([ a == b]). 2010-05-17 Dejan Muhamedagic Low: oracle, oralsnr: remove bashisms 2010-05-16 Florian Haas Low: mysql: advertise monitor actions for Master and Slave role Thanks to John Ratz for reporting this on the mailing list. doc: show only one action in man page synopsis If a resource agent defines several action instances for specific roles, it's confusing to list the actions multiple times in the synopsis. Select only one. 2010-05-14 Jonathan Brassow halvm: Fix bug 506587: lvm agent incorrectly reports vg is in volume_list If the name of the VG controlled by rgmanager is a substring of the name of the node, the HA-LVM script will complain of an improper setup. The fix is to also look for the associated quotes that are necessary when specifying strings in lvm.conf. Thanks to John Ruemker for the patch. 2010-05-12 Florian Haas Medium: VirtualDomain: back out FQDN resolution support After some discussion on the mailing list, FQDN resolution of migration targets via host(1) turned out to not be such a stellar idea. Back out changesets a7c0f35916bf, 6e9f621b685a.
--HG-- extra : rebase_source : cae4d8a9eb54a1ce12daf8d9ae16e1f3e54a2c2a 2010-05-12 Dejan Muhamedagic Low: ldirectord: use $1 instead of \1 in pattern replace (bnc#605086) 2010-05-06 Florian Haas Low: VirtualDomain: check for host(1) during validate Add a conditional check for host(1) during validate, if "migrate_use_fqdn" is set to true. Patch based on work by Dag Stenstad. --HG-- extra : rebase_source : 3df5cd7cc8dbae17e94a957b4bedcbd11b53fd5d Low: VirtualDomain: improve VirtualDomain_Define logging Patch based on work by Dag Stenstad. --HG-- extra : rebase_source : 9bd0158effbb94a8e022824921577c5b257aba8f 2010-05-05 Dag Stenstad Low: VirtualDomain: add support for resolving FQDN of destination node during migration Add "migrate_use_fqdn" boolean parameter which, if set, causes the target node's name to be resolved to its FQDN during migrate_to. Helpful if migration uses TLS, certificates have been issued that contain the nodes' FQDNs, and everything is running on a platform where "uname -n" does not render the FQDN. --HG-- extra : rebase_source : 1aba1d3c8005ed17b6ecc66bb1acd259799e3dc0 2010-05-06 Keisuke MORI High: set the HA_RSCTMP directory to /var/run/resource-agents (lf#2378) The HA_RSCTMP directory should be cleaned up on reboots. This is automatically done for directories under /var/run. Heartbeat used to clean up this directory on every start, but that is wrong in case resources are in the unmanaged mode. Corosync/Openais never cleaned up this directory. Several resource agents used to create their own subdirectories. That's not an option anymore because only first level subdirectories of /var/run are cleaned. IPv6addr is modified to use the new temporary directory. There is now agent_config.h which may be included by other packages. The temporary directory is created with permissions 1755. In case a resource agent wants to manage a file there but runs as non-root, the permissions will have to change to 1777.
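The two modes named in the HA_RSCTMP entry above both set the sticky bit (the leading 1); they differ only in who may create files in the directory. A minimal shell sketch of that difference, assuming a scratch directory from mktemp stands in for /var/run/resource-agents, and using make_rsctmp as a hypothetical helper name that is not part of the package:

```shell
# Hypothetical helper (not part of resource-agents): create a directory
# for agent state files with an explicit mode.
#   $1 = directory path
#   $2 = mode; defaults to 1755 (sticky bit set, writable by the owner only)
make_rsctmp() {
    mkdir -p "$1" && chmod "${2:-1755}" "$1"
}

# Assumption: work in a scratch location, not the real /var/run/resource-agents.
scratch=$(mktemp -d)

# Default mode: only the owner (normally root) can create files.
make_rsctmp "$scratch/rsctmp"

# Mode 1777: any user can create files, as needed by agents running as non-root.
make_rsctmp "$scratch/rsctmp-shared" 1777

# Show the resulting permission strings for comparison.
ls -ld "$scratch/rsctmp" "$scratch/rsctmp-shared"
```

With the sticky bit set, a user who can write to the directory can still only remove or rename files they own, which is why 1777 is the usual choice for a shared temporary directory.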
2010-05-03 Florian Haas Low: pgsql: update copyright notice Low: pgsql: improve monitor Use local variable, make generated psql option string a bit more legible. Medium: pgsql: implement "config" parameter Add ability to specify PostgreSQL configuration file. Necessary on platforms where the config file is not necessarily found inside the "pgdata" directory. Config file is passed in via the "-c config_file=" option. Low: pgsql: properly advertise start_opt High: pgsql: properly implement pghost parameter Can't ever have worked in practice; pgsql_monitor would attempt to connect to a host that pgsql_start never configured PostgreSQL to listen on. Low: pgsql: simplify output logging Now that we use ocf_run, we no longer need to capture output manually. Low: pgsql: use ocf_run Medium: pgsql: remove useless check for pg_ctl Already checked with check_binary during pgsql_validate_all, which is invoked before pgsql_start. Low: pgsql: use check_binary Low: pgsql: don't unset LC_ALL and LANGUAGE Not necessary, already taken care of in .ocf-shellfuncs. Low: pgsql: fix defaults and metadata Add proper defaults and variable initialization. Declare integer RA parameters as integers, not strings. Remove redundant "default is..." statements from metadata. Remove redundant information about RA parameters from script header. Low: pgsql: fix possible bashism Use kill -s SIG, not kill -SIG. Low: pgsql: fix typos, grammar errors, and whitespace 2010-05-03 Tim Serong Low: SAPDatabase RA: Correctly determine BOOTSTRAP if DIR_BOOTSTRAP OCF param not specified High: SAPInstance RA: don't rely on op target rc when monitoring clones (lf#2371) 2010-04-28 Jonathan Brassow HA LVM: Use CLVM with local machine kernel targets (bz 585217) When a logical volume is activated in a cluster exclusively, the kernel targets used are single machine targets. This means we can use CLVM to protect the LVM metadata and still better align ourselves with active/passive application stacks.
Making this change also simplifies HA LVM setup. There is no more setting up tags, volume_list entries, or updating the initrd. 2010-04-21 Keisuke MORI Medium: IPaddr/IPaddr2: add a description of the assumption in meta-data Medium: IPaddr: return the correct code if interface delete failed 2010-04-26 Dejan Muhamedagic Low: exportfs: quote parameter references in validate Low: exportfs: use variables for common paths Low: exportfs: no need for mktemp in exportfs_monitor 2010-04-26 Florian Haas Low: iscsi: use check_binary open_iscsi_setup: use standard check_binary test rather than explicit which magic when checking for iscsiadm 2010-04-26 Dejan Muhamedagic Low: ocf-binaries: reduce have_binary and use local var; improve error message 2010-04-25 Florian Haas Low: iscsi: update outdated metadata for discovery_type Since the RA was originally written, open-iscsi has learned iSNS, SLP, and fw discovery. Update outdated parameter longdesc saying only sendtargets was supported. Low: iscsi: define and use defaults properly, fix parameter descriptions Dev: iscsi: use here document markers that emacs can grok Some emacs configurations tend to not like here document markers other than EOF. 2010-04-23 Florian Haas Low: update copyright notices --HG-- extra : rebase_source : be98ce916de338c56d467e83803c00f6c9b05b2b 2010-04-23 Dejan Muhamedagic Low: nfsserver: improve notify options processing Medium: nfsserver: rpc.statd as the notify cmd does not work with -v (thanks to Carl Lewis) Low: nfsserver: fix the default string for the notification parameter 2010-04-23 Florian Haas Medium: exportfs: use ocf_run where appropriate Use convenience functions where available; no need to reinvent the wheel. 2010-04-22 Florian Haas Medium: exportfs: support status Support the "status" command, as advertised by usage message. Medium: exportfs: fix grep invocation and regex Allow showmount -e to report whitespace between export path and client spec. Use grep -E instead of grep -P. 
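The grep change in the exportfs entry above can be illustrated with a small sketch. The changelog does not show the agent's actual pattern, so the helper name export_is_listed, the sample paths, and the regular expression below are assumptions for illustration only; the point is that a POSIX character class with grep -E accepts any run of whitespace between the export path and the client spec, without relying on the Perl-regex support that grep -P needs:

```shell
# Hypothetical helper (not the agent's real code): read an export list
# shaped like `showmount -e` output on stdin and succeed if it contains
# the directory followed by the client spec, separated by whitespace.
#   $1 = exported directory, $2 = client specification
# Assumption: the arguments are used as-is inside the pattern, so they
# should not contain characters that are special in extended regexes.
export_is_listed() {
    grep -E -q "^${1}[[:space:]]+${2}\$"
}

# Usage with sample lines; several spaces or a tab are both accepted.
printf '%s\n' '/srv/nfs      10.0.0.0/24' |
    export_is_listed /srv/nfs 10.0.0.0/24 && echo "listed"
```

Anchoring the pattern at both ends keeps an export such as /srv/nfs2 from being mistaken for /srv/nfs.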
Low: exportfs: set executable bit Medium: exportfs: improve validate Check for required parameters. Check whether the exported directory exists (non-probes only). Medium: RA: iSCSITarget: follow changed IET access policy IET used to allow target access by default, now it denies. --HG-- extra : rebase_source : cd340e5d43b3b1e038877bfa140b96bceea78505 2010-04-21 Dejan Muhamedagic Medium: oracle: reduce output from sqlplus to the last line for queries (bnc#567815) 2010-04-19 Florian Haas High: RA: mysql: add replication monitoring, clone and m/s capability Based on lots of work by Marian Marinov. Extends the mysql RA to - monitor MySQL replication and evict replication slaves that have fallen too far behind the master, or encountered a replication error. - run in clone mode, presumably with all hosts running as MySQL slaves. - run in stateful (master/slave) mode, with the cluster manager managing most aspects of MySQL replication autonomously. To not interfere with releases, development work on this functionality was done in two separate branches ("mysql-clone" and "mysql-ms-2"), which are now no longer needed and will be closed. 2010-04-09 Florian Haas Low: RA: mysql: increase promoted master's preference --HG-- branch : mysql-ms-2 2010-04-08 Florian Haas Merge with mysql-clone --HG-- branch : mysql-ms-2 Medium: RA: mysql: be aware of lagging slaves Introduce a parameter to drop slaves from the cluster if their lag behind the master exceeds a threshold, and another to set that threshold. Also, use that maximum lag as the maximum master preference available in the cluster, and work out each instance's master preference in master/slave setups based on that maximum preference, minus the number of seconds each slave is behind the master.
--HG-- branch : mysql-ms-2 Merge with upstream --HG-- branch : mysql-clone 2010-04-07 Florian Haas Low: RA: mysql: set low default master preference Absent any explicit location constraint applying to the Master role, the promotion score is -1 for any stateful resource on any cluster node, effectively preventing promotion of any slave to the master role. Follow this pattern: - On successful start, set the master preference to 10 - After successful promotion, set the preference to 100 - On demotion, return it to 10 - Prior to stopping, clear the master preference --HG-- branch : mysql-ms-2 2010-04-06 Florian Haas Low: RA: mysql: update metadata --HG-- branch : mysql-ms-2 Low: RA: mysql: improve unset_master Record SHOW PROCESSLIST output to a temporary file. Exit with OCF_ERR_GENERIC if STOP SLAVE or CHANGE MASTER TO fails. --HG-- branch : mysql-ms-2 Low: mysql: remove get_master_status function No longer needed, now that all slaves always connect to a freshly reset master. --HG-- branch : mysql-ms-2 Low: RA: mysql: improve set_master, pre and post notify Always use RESET MASTER when promoting a new master. Greatly facilitates reconfiguring slaves to use the new master, as we don't have to worry about binlogs and binlog positions anymore. Also, change set_master so it no longer issues START SLAVE. While we want to reconfigure slaves to use the new master during the pre-promote notification, we only want them to start replicating after they received a post-promote notify. See also: http://dev.mysql.com/doc/refman/5.0/en/replication-solutions-switch.html --HG-- branch : mysql-ms-2 Low: RA: mysql: improve unset_master Instead of issuing just STOP SLAVE, give the replication slave a chance to complete pending statements: - First, stop just the slave I/O thread. - Wait for the relay log to be processed. - Then, stop the slave SQL thread too. - Unset the master host. 
--HG-- branch : mysql-ms-2 Medium: RA: VirtualDomain: bail out early if config file can't be read during probe (Novell 593988) If the user placed the libvirt domain config file on shared storage, then there is no way for us to deduce the correct domain name. Thus we can't check whether it's running, and the only thing we can assume is that it's not. Which is ugly, but probably better than throwing an error. https://bugzilla.novell.com/show_bug.cgi?id=593988 Low: RA: mysql: fix incorrect logic in is_slave --HG-- branch : mysql-ms-2 Low: RA: mysql: really unset the replication master in unset_master --HG-- branch : mysql-ms-2 Merge branch mysql-clone --HG-- branch : mysql-ms-2 Low: RA: mysql: Make mysql_monitor a bit more legible --HG-- branch : mysql-clone extra : transplant_source : C%5B3%3E%E9%FF%C6%E0%EF%B2%E9%B4%84A%B8%E9%C5%93%08%24 Merge branch mysql-clone --HG-- branch : mysql-ms-2 Low: RA: mysql: change is_slave/check_slave logic Make check_slave fail noisily if it is checking a non-slave MySQL instance, and invoke it only after testing with is_slave first. --HG-- branch : mysql-clone Low: RA: mysql: add is_slave convenience function This function only checks whether the host is currently configured as a MySQL replication slave, regardless of whether replication is actually functional or not. --HG-- branch : mysql-clone extra : transplant_source : %3D%A7/%12%D0%A5%5D%CEc%19%CFr%00%E0%88%D7%5E%2A%90%F1 2010-04-02 Florian Haas Medium: RA: mysql: fix missing function argument --HG-- branch : mysql-ms-2 Low: RA: mysql: ignore notifications on the master host Ignore promote/demote notification that the master host receives about its own promotion/demotion. --HG-- branch : mysql-ms-2 Medium: RA: mysql: fix typo --HG-- branch : mysql-ms-2 Medium: RA: mysql: add notify operation Patch based on earlier work from Marian Marinov. --HG-- branch : mysql-ms-2 Medium: RA: mysql: add promote and demote actions Patch based on lots of work from Marian Marinov. 
Add promote and demote, which are essentially no-ops except for setting and unsetting read-only mode. The actual reconfiguration of slaves to connect to new masters is done during notify. --HG-- branch : mysql-ms-2 Medium: RA: mysql: add convenience function and variables for master/slave mode Patch based on lots of earlier work from Marian Marinov. Add functions to fetch replication information from a MySQL replication master, and to instruct a slave to connect to a new master. --HG-- branch : mysql-ms-2 Create branch mysql-ms-2 --HG-- branch : mysql-ms-2 Medium: RA: mysql: improve monitor for replication slaves Improve check_slave for deep monitoring of MySQL servers configured as replication slaves. Check various replication slave status parameters and observe the following policy: * If replication has run into an error (that is, a replicated statement failed to execute on the local instance), bail out with $OCF_ERR_INSTALLED, to make sure that the resource is not restarted in place. Rationale: the MySQL slave is definitely out of sync with the master, and rectifying the issue requires manual intervention. * If replication IO threads are stopped, only log a warning. This may be a relatively benign situation -- for example, the replication master may just be restarting. * If replication SQL threads are stopped, exit with $OCF_ERR_GENERIC, so the cluster manager attempts to recover the resource in place. --HG-- branch : mysql-clone Low: RA: mysql: remove test for clone-node-max For non-anonymous clones, the pengine enforces clone-node-max=1 anyway, so no need to duplicate that test in the RA. --HG-- branch : mysql-clone Low: RA: mysql: create per-instance temp files --HG-- branch : mysql-clone Medium: RA: mysql: fix check_slave "| while" creates an implicit subshell, won't set any variables in the main shell. Use sed on the temp file instead.
--HG-- branch : mysql-clone Medium: RA: mysql: fix mktemp invocation --HG-- branch : mysql-clone High: RA: mysql: fix syntax error --HG-- branch : mysql-clone Medium: RA: mysql: check slave status using test_user and test_passwd --HG-- branch : mysql-clone Low: RA: mysql: add check_slave function, include in monitor Add a check_slave function that tests whether the instance is configured as a slave, and if yes checks SHOW SLAVE STATUS to see if anything is wrong with the local replication status. Purpose: a slave that has run into a replication error must be removed from the cluster, otherwise it would serve stale data out to clients. Patch based on Marian Marinov's work. --HG-- branch : mysql-clone 2010-04-02 Dejan Muhamedagic Add tag agents-1.0.3-rc1 for changeset 548028f131c9 Dev: build: set version to 1.0.3, release to 0rc1 Medium: Filesystem: prefer /proc/mounts to /etc/mtab for non-bind mounts (lf#2388) 2010-03-31 John Shi Low: Tools: ocft: fix agent installation --HG-- extra : rebase_source : d7a5bd2a9b81d07c6e444b078d617b84059e5eb4 2010-03-31 Florian Haas Medium: RA: VirtualDomain: spin on define until we definitely have a domain name Under some circumstances, we would record an empty domain name to the state file, prompting us to bail out later on. Now instead, keep trying until we get a non-empty reply for the domain name. Medium: RA: VirtualDomain: improve error messages Be a little more verbose in telling people what went wrong in case we were unable to create, or read from, the state file. --HG-- extra : rebase_source : a042a42354ef7ee38dcc4dd46e6fae811cf15859 Low: RA: VirtualDomain: make VirtualDomain_Define easier to debug Split the define function across several lines, so the RA is easier to debug with set -x.
--HG-- extra : rebase_source : 3f9ffd3968eb545450dd09a10438edf7f90b005d 2010-03-31 renayama19661014@ybb.ne.jp Low: Filesystem: improve logging 2010-03-29 Dejan Muhamedagic Medium: build: add the postfix RA 2010-03-29 Iida Yuusuke Medium: RA: VirtualDomain: fix incorrect use of __OCF_ACTION 2010-03-26 Dejan Muhamedagic Medium: IPaddr2: don't bring the interface down on stop (thanks to Lars Ellenberg) 2010-03-25 Florian Haas Low: RA: exportfs: avoid double validate Low: RA: exportfs: move non-functions after functions Follow precedent set by other RAs: move all the exportfs-*() functions above the non-function code. Low: RA: exportfs: cosmetic metadata fixes Cut lines down to about 80 columns. Remove empty default declarations. 2010-03-25 Ben Timby Medium: RA: exportfs: check more thoroughly if export succeeded Low: RA: exportfs: allow clone mode There is no real reason to disallow configuring this RA as a clone (provided the exported filesystem is a cluster FS, but that's not the RA's job to decide). 2010-03-25 Florian Haas Medium: RA: exportfs: declare fsid integer 2010-03-25 Ben Timby Medium: RA: exportfs: add "fsid" parameter Instead of generating the fsid, require an fsid RA parameter and tack that onto the export options. Medium: RA: exportfs: rename "dir" to "directory" Rename the "dir" RA parameter to "directory", to follow precedent set by Filesystem. 2010-03-25 John Shi Low: Tools: ocft: Fix configuration bugs and keep state between tests 2010-03-24 Dejan Muhamedagic Medium: oracle/oralsnr: improve exit codes if the environment isn't valid 2010-03-24 Florian Haas Low: RA: exportfs: add proper shortdesc Medium: RA: exportfs: drop prefix from RA parameters By convention, RA parameters don't contain any prefix. Replace OCF_RESKEY_exportfs_ with just OCF_RESKEY_. Low: RA: exportfs: drop $OCF_DEBUG_LIBRARY, add $OCF_FUNCTIONS_DIR Add standard initialization for RAs using $OCF_FUNCTIONS_DIR. Drop reference to $OCF_DEBUG_LIBRARY.
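For readers unfamiliar with the "standard initialization" mentioned in the exportfs entry above, the following is a minimal sketch of what such a block might look like. It is not copied from the exportfs RA: the helper name load_ocf_shellfuncs and the /usr/lib/ocf fallback for OCF_ROOT are assumptions made for illustration, and real agents typically do this inline at the top of the script.

```shell
# Hypothetical helper (name is illustrative only): locate and source
# the shared shell function library.
load_ocf_shellfuncs() {
    # OCF_ROOT is normally exported by the cluster manager; the
    # fallback used here is an assumption for running outside a cluster.
    : "${OCF_ROOT:=/usr/lib/ocf}"
    # Keep a caller-supplied OCF_FUNCTIONS_DIR, otherwise use the
    # default location under the heartbeat provider directory.
    : "${OCF_FUNCTIONS_DIR:=${OCF_ROOT}/resource.d/heartbeat}"

    if [ ! -r "${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs" ]; then
        echo "cannot read ${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs" >&2
        return 1
    fi
    # Source the library into the current shell.
    . "${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs"
}
```

Because the default is only applied when OCF_FUNCTIONS_DIR is unset or empty, a build or test harness can point the agent at a source checkout simply by exporting the variable before invoking it.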
Medium: RA: exportfs: add to Makefiles 2010-03-24 Ben Timby High: RA: exportfs: manage NFS exports Originally published at http://ben.timby.com/pub/exportfs.txt and referenced in: http://lists.linux-ha.org/pipermail/linux-ha/2010-March/039965.html 2010-04-15 Dejan Muhamedagic Low: nfsserver: don't use absolute path for mktemp 2010-04-14 Dejan Muhamedagic Add tag agents-1.0.3 for changeset 5ae70412eec8 Low: remove irrelevant changes from the changelog 2010-04-13 Dejan Muhamedagic Low: add the release message to the changelog build: set version to 1.0.3, release to 1 update the ChangeLog for release 1.0.3 2010-03-23 Dejan Muhamedagic Low: Route: don't assume that OCF_RESKEY_CRM_meta_clone_node_max is set to a number (lf#2375) 2010-03-19 Dejan Muhamedagic Medium: meta-data: improve timeouts in most resource agents 2010-03-19 NAKAHIRA Kazutomo Dev: SFEX daemon: fix logging 2010-03-18 Dejan Muhamedagic Low: ldirectord: fix the configfile default (bnc#589457) Low: drbd: fix metadata (bnc#588684) Low: Xen: fix documentation --HG-- extra : rebase_source : 62f36fcd2e874ea94d40664523b476925268f8fa 2010-03-18 Andrew Beekhof Low: Build: Allow building of versions other than the checked out one --HG-- extra : rebase_source : 49c13d5fb69da3fa913914f113302479dc663aa0 2010-03-15 Florian Haas High: RA: mysql: fix breakage introduced by ocft import Changeset 7b58b4400fec erroneously imported changes to the mysql RA that are not ready for production use and are carried in a separate branch for a reason. Revert that changeset as far as the mysql RA is concerned.
2010-03-09 John Shi Low: Tools: ocft: fix remote shell 2010-03-09 nakahira@intellilink.co.jp Medium: sfex: don't use pid file (lf#2363,bnc#585416) 2010-03-09 Dejan Muhamedagic Dev: anything: add back the exec bit 2010-03-08 Dejan Muhamedagic Medium: IPsrcaddr: modify the interface route (lf#2367) Low: anything (the RA): remove pid file in the stop action (lf#2365) 2010-03-04 Dejan Muhamedagic Medium: sfex: exit with success on stop if sfex has never been started (bnc#585416) 2010-03-04 John Shi High: Tools: ocft: new RA test suite 2010-03-02 Florian Haas Low: RA: vmware: test for probes Use vmware_validate_probe only if we are indeed invoked by a probe (as per ocf_is_probe). Low: RA: vmware: define OCF_RESKEY_vimshbin_default Define OCF_RESKEY_vimshbin_default, and initialize OCF_RESKEY_vimshbin to it if not set via a resource parameter. Low: RA: vmware: fix trailing whitespace 2010-03-02 Cristian Mammoli High: RA: vmware: update to version 0.2 * correctly set a default for the "vimshbin" RA param * rename set_environment() to vmware_set_env() * improve validate so it exits immediately on error, rather than returning * be more persistent on stop * improve metadata * various small fixes 2010-03-02 Florian Haas Medium: RA: ManageRAID: require bash This RA uses arrays, so it was never Bourne shell clean. It thus could only ever have worked with /bin/sh linked to /bin/bash, so let's require that explicitly. High: RA: mysql: Backed out changeset b515256cbd2f Done in a hurry, touched the wrong RA. Sorry. Medium: RA: ManageRAID: require bash This RA uses arrays, so it was never Bourne shell clean. It thus could only ever have worked with /bin/sh linked to /bin/bash, so let's require that explicitly.
--HG-- extra : rebase_source : d1f281e53c1705a86e20b11ef0f9c455e9234e03 2010-03-01 Florian Haas Low: doc: add supported actions to RA man pages Add a "supported actions" section to the RA man pages, detailing which actions the RA supports, and suggested timeout and interval values for each. 2010-02-26 Marian Marinov Low: RA: mysql: Move all common variables out of the BSD check --HG-- branch : mysql-clone extra : transplant_source : %C8%05r%B3%C0%B4%26%F2_%E2%BB%BC%0E%95%C0d%8CW%8F%E8 2010-04-02 Florian Haas Low: RA: mysql: split long lines in monitor --HG-- branch : mysql-clone Low: RA: mysql: add check for clone-node-max > 1 Running 2 MySQL instances from 1 datadir is not supported. --HG-- branch : mysql-clone Low: RA: mysql: fix whitespace As far as I can tell, most of the RA uses an Emacs-style whitespace convention. Remove a few vi-isms. --HG-- branch : mysql-clone Medium: RA: mysql: make client binary path configurable Add client_binary resource agent parameter. Fix validate to use check_binary. --HG-- branch : mysql-clone Create branch "mysql-clone" While the mysql-ms branch will still need some work to become useful, create a separate branch for being able to just clone MySQL slave instances. --HG-- branch : mysql-clone 2010-02-26 Florian Haas Medium: doc: correct crm shell examples for RAs without any required parameters Do not output an empty "params \" line for RAs which come with no required parameters. Thanks to Lars Marowsky-Bree for reporting this. 2010-02-25 Florian Haas Low: .ocf-shellfuncs: add a test for whether the resource is running as a clone Add ocf_is_clone() to .ocf-shellfuncs, which returns true only if the resource - is configured as a clone, not a primitive, and - its clone-max meta attribute is set to >0.
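The two conditions listed for ocf_is_clone() can be illustrated with a short sketch. This is not the text of the committed function; it assumes that the cluster manager exports the clone-max meta attribute as OCF_RESKEY_CRM_meta_clone_max (by analogy with OCF_RESKEY_CRM_meta_clone_node_max, which appears elsewhere in this log) and that the variable is simply unset for a primitive.

```shell
# Sketch only, under the assumptions stated above.
ocf_is_clone() {
    # Unset or empty: the resource is a primitive, not a clone.
    [ -n "${OCF_RESKEY_CRM_meta_clone_max}" ] || return 1
    # Set: true only for a value greater than zero. A non-numeric
    # value makes the test fail, and its error message is discarded.
    [ "${OCF_RESKEY_CRM_meta_clone_max}" -gt 0 ] 2>/dev/null
}
```

An agent could then guard clone-only code paths with "if ocf_is_clone; then ... fi". The ocf_is_ms() test in the next entry would presumably follow the same pattern with the master-max attribute.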
2010-02-25 Marian Marinov Low: .ocf-shellfuncs: add a test for whether the resource is running in multistate mode Add ocf_is_ms() to .ocf-shellfuncs, which returns true only if the resource - is configured as a multistate resource, not a primitive, and - its master-max meta attribute is set to >0. 2010-02-24 Simon Horman Medium: ldirectord: Allow multiple email addresses (LF 2168) By stripping off the enclosing double quotes emailalert may have multiple addresses. ldirectord: make default configfile path of OCF agent the same as ldirectord make default configfile path of OCF agent the same as the core ldirectord code. Cc: Dejan Muhamedagic Cc: Andreas Kurz Reported-by: Thomas Baumann 2010-02-23 Dejan Muhamedagic Low: remove remaining start-delay attributes from meta-data Medium: ocf-shellfuncs: don't log but print to stderr if connected to a terminal --HG-- extra : rebase_source : e098e2c45c34d40eda3c7db90da0f2a6eadd2d00 2010-02-17 Marek 'marx' Grac resource agents: Handle multiline pid files if the pid file contains more than 1 line (like sendmail) the status_check_pid function returns an error. Patch by Kaloyan Kovachev 2010-02-10 Lon Hohberger resource-agents: isAlive error logging for file systems This change adapts two different patch sets, one contributed by Nick Downs. It fixes: - isAlive logging for all file systems - file naming during isAlive checks for cluster file systems Resolves: rhbz#562237 2010-02-09 Florian Haas Medium: doc: fix incorrect detection of required parameters Fix an incorrect XPath which would cause required parameters to be listed as "optional" in the man page. --HG-- extra : rebase_source : 1f6270afe4d11043d52cc3667a4ac023bf343e63 2010-02-22 NAKAHIRA Kazutomo Medium: ocf-shellfuncs: don't output to stderr if using syslog 2010-02-15 Florian Haas Medium: RA: make sure that OCF_RESKEY_CRM_meta_interval is always defined (LF 2284) In ocf-shellfuncs, ocf_is_probe() relies on OCF_RESKEY_CRM_meta_interval being defined. 
If it is not available in the environment, initialize it to 0. High: RA: vmware: fix set_environment() invocation (LF 2342) Move set_environment before vmware_validate. Without this the RA never has a chance to validate successfully. This unbreaks a change introduced in changeset 787e163c86ed. High: build: HA_NOARCHBIN must not include PACKAGE_NAME HA_NOARCHBIN must not depend on PACKAGE_NAME, but instead must be hardwired to include "heartbeat". Otherwise, haresources configurations fail as Heartbeat can't find ResourceManager on startup (as it looks in the wrong directory). Thanks to Matthias Albert for spotting this. 2010-02-15 Dejan Muhamedagic Low: oralsnr: improve logging (thanks to NAKAHIRA Kazutomo) Low: oracle: improve logging (thanks to NAKAHIRA Kazutomo) --HG-- extra : rebase_source : bbc5e4e720653b405780046d4d90aee796b1701a 2010-02-15 Simon Horman ldirectord: use $v->{frequency} if available This fixes what appears to be a typo. 2010-02-10 Dejan Muhamedagic Low: apache: return the right exit code from monitor (bnc#578628) 2010-02-08 Florian Haas Medium: RA: iSCSILogicalUnit: fix monitor for STGT Recent versions of tgtadm changed the output for displaying the backing store. Previous: Backing store: Current: Backing store path: Fix the regex for parsing tgtadm output. Thanks to Andreas Kurz for pointing this out. 2010-02-05 Simon Horman Don't install non-scripts as scripts .ocf-shellfuncs, .ocf-binaries, .ocf-directories, ocf-shellfuncs and ocf-returncodes are not scripts, rather they are "libraries" of shell functions. And ra-api-1.dtd is a DTD. As such they shouldn't be installed as executable. With this in mind it seems that the DATA target is more appropriate than the SCRIPTS target.
This problem was flagged by lintian, for more information see: http://lintian.debian.org/tags/executable-not-elf-or-script.html detect pod2man during configure And only build pod2man documentation if it is available 2010-02-04 jesusch@jesusch.de Medium: Route: add route table parameter (lf#2335) 2010-02-01 Dejan Muhamedagic Add tag agents-1.0.2 for changeset 61203c6ef710 build: add ChangeLog to the package documentation build: set Release to 1 Dev: add ChangeLog since release 2.1.4 2010-01-30 Simon Horman ldirectord: Avoid excessively long unbreakable lines in documentation ldirectord: back 4 is not a pod directive, use back instead 2010-01-28 Dejan Muhamedagic Dev: shellfuncs: revert cs 1489123f2305, HA_NOARCHBIN is used elsewhere 2010-01-28 andreas.kurz@linbit.com Medium: ldirectord: fix setting defaults for configfile and ldirectord (lf#2328) 2010-01-27 andreas.kurz@linbit.com Low: ldirectord: handle meta-data before ldir_init (lf#2327) 2010-01-26 Florian Haas Low: doc: shorten configuration examples in RA man pages Suppress all optional parameters from the crm shell configuration snippets in the RA man pages. In addition, suppress all "op" lines except the one corresponding to the monitor action. 2010-01-26 Dejan Muhamedagic Medium: nfsserver: use default values (lf#2321) Low: fix some wrong required attribute settings Low: remove start-delay from all resource agents 2010-01-26 Jonathan E. Brassow rgmanager: halvm: Check ownership before stripping tags Resolves: rhbz#557167 2010-01-25 Florian Haas Low: doc: add RA deprecation warnings to shortdesc (LF 2244) Add "(deprecated)" to the shortdesc of RAs marked as deprecated. 2010-01-24 Florian Haas Low: doc: fix Xinetd RA shortdesc The Xinetd RA manages an Xinetd service, not the superserver itself. Fix shortdesc accordingly. 2010-01-23 Florian Haas Low: RA: mysql: correctly advertise "additional_parameters" parameter (LF 2320) Was advertised as integer, really expected a string. Change parameter type to string.
Also, fix a small typo in the parameter description. 2010-01-22 Florian Haas Low: doc: make RA man pages tell whether parameters are required or optional (LF 2319) 2010-01-22 Dejan Muhamedagic Dev: portblock: fix warning in tickle_tcp High: portblock: fast reconnect/tickle ACK (patch by Jiaju Zhang) 2010-01-22 Florian Haas Low: RA: ManageVE: force bash (LF 2318) This RA uses "[[ ]]", which is a bash construct. It should declare /bin/bash as its interpreter. Low: doc: add a "see also" link to resource agent man pages Resource documentation that goes beyond the man page (Howto documents, etc.) are best kept in the Linux-HA wiki. Add a "see also" link to each man page that points to the wiki page. --HG-- extra : rebase_source : b1b2d44bb6aa2f56032cef210b7613410a43493b Low: doc: if an RA advertises migrate_*, add allow-migrate to example If a resource agent advertises support for migration, then to actually utilize that migration capability users will have to set the "allow-migrate" meta attribute to true. Add a line to that effect to the example CRM shell snippet. --HG-- extra : rebase_source : 851149eb19a61484397d1fa46c57bf2fbd005c15 2010-01-21 Dejan Muhamedagic Low: ClusterMon: don't fail in stop if the process is missing (bnc#569957) 2010-01-16 Tim Serong Dev: CTDB: auto-generate cluster-specific part of smb.conf (LF 2308) 2010-01-14 NAKAHIRA Kazutomo Low: pgsql: replace echo command with ocf_log 2010-01-12 Fabio M. 
Di Nitto misc: update copyright year across the board 2010-01-04 renayama19661014@ybb.ne.jp Low: Xen: check for more required programs Low: Xen: improve logging 2010-01-04 Jiaju Zhang Dev: portblock: fix wrong return status of unblock in monitor 2009-12-29 Dejan Muhamedagic Dev: Xen: fix indentation from the previous changeset 2009-12-28 Dejan Muhamedagic Dev: build: add %dir /usr/share/resource-agents to .spec Low: build: add Conflicts in .spec Low: build: fix Obsoletes 2009-12-28 renayama19661014@ybb.ne.jp Low: Xen: improve logging 2009-12-23 Dejan Muhamedagic Dev: build: set release to 1.0.2-rc2a 2009-12-22 Lon Hohberger resource-agents: Add missing btrfs & ext4 support 2009-12-22 Dejan Muhamedagic Dev: shellfuncs: drop HA_NOARCHBIN, it's not used Dev: shellfuncs: fix HA_NOARCHBIN path 2009-12-21 Lon Hohberger rgmanager: Fix erroneous bind mount warning in fs.sh The previous patch 26e9e538b22554d21ae43c4b379b664daf6f05d3 produced an incorrect error when mounting the file system. Resolves: bz526286 2009-12-18 Dejan Muhamedagic Add tag agents-1.0.2-rc2 for changeset d1f026d175cf Dev: build: output documentation directory from configure (LF 2276) Dev: build: fix dtd location (LF 2280) 2009-12-16 Florian Haas Low: doc: improve ra2refentry.xsl stylesheet Suppress superfluous newlines in crm shell configuration example. Add quotes around default values. Medium: mark obsolete RAs as deprecated (LF 2244) Mark the following RAs as deprecated, add ocf_deprecated warning, and ignore_deprecation parameter: drbd Evmsd EvmsSCC LinuxSCSI pingd Low: .ocf-shellfuncs: add ocf_deprecated convenience function (LF 2244) Add an ocf_deprecated function, to log a deprecation warning on every RA invocation (unless deliberately suppressed). ocf_deprecated checks for the truth value of a configurable resource parameter (default ignore_deprecation), and if that value is set to false (or unset), logs a warning with ocf_log warn. 
Deprecated resource agents should simply invoke ocf_deprecated on every action except usage and meta-data, and should also define a boolean parameter named ignore_deprecation, defaulting to false. A resource agent making use of this feature should also update its longdesc and shortdesc to explain _why_ it is deprecated, and suggest an alternative. 2009-12-15 Shane Bradley rgmanager: Fix ipv6 handling Resolves: rhbz#533461 2009-12-15 Florian Haas High: RA: VirtualDomain: fix forceful stop (LF 2283) Adopt the following strategy in VirtualDomain_Status during the "stop" operation: * If the domain is definitely running, return $OCF_SUCCESS. * If the domain is definitely not running, return $OCF_NOT_RUNNING. * Under all other circumstances (virsh domstate returning "no state", the empty string, or throwing an error), return $OCF_ERR_GENERIC. Retain the old behavior when VirtualDomain_Status is invoked from any other operation: * If the domain is definitely running, return $OCF_SUCCESS. * If the domain is definitely not running, return $OCF_NOT_RUNNING. * If the domain has "no state", or virsh domstate returns the empty string, keep trying until we time out. * Under all other circumstances, return $OCF_ERR_GENERIC. Then inside VirtualDomain_Stop, behave as follows: * If OCF_RESKEY_force_stop is true, skip immediately to forced shutdown (destroy). * Otherwise, try a graceful shutdown first, waiting for it to complete for timeout minus five seconds, looping on VirtualDomain_Status. - If at any time VirtualDomain_Status returns something other than $OCF_SUCCESS or $OCF_NOT_RUNNING; bail out and skip to forced shutdown. - If VirtualDomain_Status returns $OCF_NOT_RUNNING (indicating graceful shutdown having succeeded), return $OCF_SUCCESS immediately. - While VirtualDomain_Status returns $OCF_SUCCESS, sleep and re-check. The downside of this behavior is that if at any time during stop libvirtd (or the virtualization layer beneath) misbehaves, we don't get graceful shutdown. 
But there seems to be no way to fix that, and at the same time making sure stop always succeeds. 2009-12-15 Lon Hohberger Fix bind mount handling in fs.sh Don't log warnings for every bind mount found for a mount point. Instead, log one warning if and only if the file system is not mounted in the correct location. Resolves: bz526286 2009-12-15 Dejan Muhamedagic Medium: IPaddr2: CLUSTERIP/iptables rule not always inserted on failed monitor (LF 2281) 2009-12-12 Dejan Muhamedagic Add tag agents-1.0.2-rc1 for changeset 1ab6e669bd52 2009-12-11 Dejan Muhamedagic Dev: build: set version to 1.0.2 2009-12-11 Florian Haas Low: build: fix docdir for autotools versions that do not support it natively * * * spec fix --HG-- extra : rebase_source : b258792f18a02f7c76a19c3d7b0e615541dfc83b 2009-12-11 NAKAHIRA Kazutomo Dev: jboss: refine argument processing (LF 2241) 2009-12-11 Dejan Muhamedagic Low: mysql: fix wrong parameter name in comment (thanks to bryan.gay+linux_bugzilla@bryangay.com) (LF 2264) 2009-12-10 Florian Haas Low: build/doc: add BuildRequires for XML toolchain to RPM spec Autotools tarballs have the man pages in $(EXTRA_DIST), so RPMs built from those tarballs should not need xsltproc or the DocBook DTDs. To enable building directly from a hg archive, however, include libxslt, the DocBook DTD package, and the DocBook stylesheet package in the BuildRequires list. 
--HG-- extra : rebase_source : 518e2895f1f4daa2d8a19cdd3802827cd4bccae5 Low: doc: highlight defaults in ra2refentry.xsl --HG-- extra : rebase_source : db3cc57c1ba7be2de3c121381fa112f7f6a4f261 2009-12-10 Dejan Muhamedagic Dev: build: drop dash from release (rpm won't eat that) Dev: build: set version to 1.0.2, release to 0-rc1 Dev: build: set version to 1.0.1rc1 2009-12-10 Achim Stumpf High: proftpd: new resource agent 2009-12-10 Dejan Muhamedagic Dev: build: enable fatal warnings and add --docdir to configure Dev: build: include README.webapps properly, dtd target directory, etc - ra-api-1.dtd goes to /usr/share/heartbeat - README.webapps included as doc - add COPYING and AUTHORS diff -r e13565f0ea8a doc/Makefile.am --- a/doc/Makefile.am Wed Dec 09 16:33:15 2009 +0100 +++ b/doc/Makefile.am Thu Dec 10 03:21:59 2009 +0100 @@ -36,10 +36,10 @@ CLEANFILES = $(man_MANS) $(xmlfiles) metadata-*.xml # TODO: add README and license files -doc_DATA = README.webapps +doc_DATA = man_MANS = -EXTRA_DIST = $(man_MANS) $(doc_DATA) +EXTRA_DIST = $(man_MANS) $(doc_DATA) README.webapps # OCF_ROOT=. 
is necessary due to a sanity check in .ocf-shellfuncs # (which tests whether $OCF_ROOT points to a directory diff -r e13565f0ea8a heartbeat/Makefile.am --- a/heartbeat/Makefile.am Wed Dec 09 16:33:15 2009 +0100 +++ b/heartbeat/Makefile.am Thu Dec 10 03:21:59 2009 +0100 @@ -24,7 +24,7 @@ ocfdir = $(OCF_RA_DIR)/heartbeat -dtddir = $(datadir)/$(PACKAGE) +dtddir = $(datadir)/heartbeat dtd_SCRIPTS = ra-api-1.dtd if USE_IPV6ADDR diff -r e13565f0ea8a resource-agents.spec --- a/resource-agents.spec Wed Dec 09 16:33:15 2009 +0100 +++ b/resource-agents.spec Thu Dec 10 03:21:59 2009 +0100 @@ -156,10 +156,11 @@ %{_sbindir}/ocf-tester %{_sbindir}/sfex_init -%dir %{_datadir}/resource-agents -%doc %{_datadir}/resource-agents/ra-api-1.dtd -%doc %{_docdir}/resource-agents/README.webapps +%doc AUTHORS +%doc COPYING +%doc %{_datadir}/heartbeat/ra-api-1.dtd %doc %{_mandir}/man7/*.7* +%doc doc/README.webapps # For compatability with pre-existing agents %dir %{_libdir}/heartbeat 2009-12-09 Florian Haas Low: doc: convert man pages to DocBook 4.4 so they build nicely on SLES 10 (LF 2258) Low: doc: add CTDB to list of autogenerated man pages (really, this time) (LF 2256) Previous changeset a919473fb232 just contained a Makefile reorganization to make it easier to follow. _This_ changeset really adds ocf_heartbeat_CTDB.7 to the list of autogenerated man pages. 
Low: doc: add CTDB to list of autogenerated man pages (LF 2256) 2009-12-09 Tim Serong Dev: CTDB: Remove dependency on /etc/sysconfig/ctdb (LF 2255) 2009-12-08 Dejan Muhamedagic Dev: ocf-shellfuncs: reduce ocf_is_probe and protect variables - it's enough to do test ..., the return code will be properly set - if the RA is invoked outside of cluster (ocf-tester, etc), some variables may not be set 2009-12-08 Tim Serong Dev: CTDB: Parameterize binary paths (LF 2248) Dev: CTDB: only source /etc/ctdb/functions if it actually exists (LF 2247) 2009-12-08 Florian Haas Low: CTDB: fix trivial bashisms Replace two instances of "[ $foo == $bar ]" with "[ $foo -eq $bar ]". Medium: CTDB: introduce OCF_FUNCTIONS_DIR, allow it to be overridden (LF2239) Fixes bug LF 2239 for the newly added CTDB RA. 2009-12-01 Federico Simoncelli vm.sh: Fix return codes 2009-11-30 Lon Hohberger vm.sh: Fix migration failure handling If a VM fails to migrate, there is a good chance that the VM is still running locally. Return a non-fatal error so that the resource does not enter the failed state. 2009-11-30 Florian Haas Medium: RA: Route: improve validate (LF 2232) No longer exit with $OCF_ERR_INSTALLED during probes, if configured source or gateway IP addresses are not present. Fixes http://developerbugs.linux-foundation.org/show_bug.cgi?id=2232 2009-12-07 Dejan Muhamedagic Low: build: add man pages to the spec file 2009-12-07 Tim Serong High: CTDB: new resource agent (fate#302227) 2009-12-07 Dejan Muhamedagic Low: build: build xml files only if BUILD_DOC Low: build: add README.webapps to Makefile and spec High: apache: monitor operation of depth 10 for web applications (LF 2234) Usage is explained in the README file. Important changes supporting this feature: - support for curl(1) - the new parameters: OCF_RESKEY_client OCF_RESKEY_testurl OCF_RESKEY_testregex10 OCF_RESKEY_testconffile OCF_RESKEY_testname - add "--no-proxy --bind-address=127.0.0.1" to wget options Also some code cleanup.
2009-12-07 Florian Haas Low: doc: add a sensible id attribute in ra2refentry.xsl --HG-- extra : rebase_source : 5ae12eb1c644c1041f7f3b2df25409acd248fa69 Low: doc: do case-insensitive sort over RA man pages In the DocBook appendix autogenerated from the RA man pages, sort RAs alphabetically by name, ignoring case. --HG-- extra : rebase_source : 3969688ab30f90121094a4cd0ca379c6c34bf0ac 2009-12-07 Hideo Yamauchi Medium: mysql: escalate stop to KILL if regular shutdown doesn't work 2009-12-07 Florian Haas Low: build: issue a configure warning if xsltproc is not installed 2009-12-07 Andrew Beekhof Don't build man pages at all if xsltproc isn't found Only build IPv6addr manpage if IPv6addr is also being built 2009-12-06 Florian Haas High: doc: add man pages for all RAs (LF2237) This adds a facility to create man pages from the resource agent metadata. The man pages list the RA description, supported parameters (with descriptions and defaults), and supported actions (with defaults). They also provide example configurations for the CRM shell. Since the man pages are generated via DocBook XML, the intermediate XML files may also be used for HTML and PDF documentation. ra2refentry.xsl script adapted from Dejan's original effort in the now-abandoned pacemaker-doc repository. This fixes LF bug 2237. 2009-12-05 Florian Haas Medium: Dev: make RAs executable (LF2239) In combination with specifying OCF_FUNCTIONS_DIR, this makes RAs executable in-place (i.e. from the source directory). This is helpful for documentation tools extracting the RA metadata during a build, like so: OCF_ROOT=/ OCF_FUNCTIONS_DIR=$PWD ./Dummy meta-data (Note: specifying OCF_ROOT=/ is necessary due to a sanity check in .ocf-shellfuncs which tests whether $OCF_ROOT points to a directory). This is part 2 of the fix for LF bug 2239.
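As an illustration of the in-place invocation quoted above, a documentation build step might loop over the agents in a source directory along the following lines. This is a sketch, not the project's actual Makefile rule: the function name dump_metadata is hypothetical, and the metadata-NAME.xml output naming is assumed from the metadata-*.xml pattern that appears in doc/Makefile.am elsewhere in this log.

```shell
# Hypothetical helper: write the metadata of every executable agent
# found in a source directory to an output directory.
dump_metadata() {
    srcdir=$1
    outdir=$2
    for ra in "$srcdir"/*; do
        # Skip directories and non-executable files such as the
        # shared function libraries.
        [ -f "$ra" ] && [ -x "$ra" ] || continue
        name=$(basename "$ra")
        # OCF_ROOT=/ only satisfies the "is a directory" sanity check;
        # OCF_FUNCTIONS_DIR points the agent at the in-tree library.
        OCF_ROOT=/ OCF_FUNCTIONS_DIR="$srcdir" "$ra" meta-data \
            > "$outdir/metadata-$name.xml" ||
            echo "meta-data failed for $name" >&2
    done
}
```

Called as "dump_metadata "$PWD/heartbeat" "$PWD/doc"", this would produce one XML file per agent; passing an absolute source path keeps OCF_FUNCTIONS_DIR valid regardless of the agent's working directory.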
2009-12-04 Florian Haas High: RA: introduce OCF_FUNCTIONS_DIR, allow it to be overridden (LF2239) This introduces an OCF_FUNCTIONS_DIR variable, pointing to the directory where the scripts find .ocf-shellfuncs and friends. It defaults to $OCF_ROOT/resource.d/heartbeat, as always. This allows us to run RAs in-place (i.e. directly from the directory in the source tarballs) without installing them into the provider directory. This is part 1 of the fix for LF bug 2239. 2009-12-06 Florian Haas Low: RA: jboss: fix meta-data command (LF 2241) The meta-data action must not return $OCF_ERR_INSTALLED. This fixes LF bug 2241. Low: doc: fix resource agent names in metadata (LF 2240) 4 resource agents reported incorrect names via their metadata (i.e. the RA name found in the metadata did not match the installed file name). This fixes LF bug 2240. --HG-- extra : rebase_source : e67047cc38160dc0de4f2d5ad39a12f9bf714e2d 2009-12-04 Florian Haas Medium: .ocf-shellfuncs: correct ocf_is_probe function Accidentally pushed the wrong version on my first try. Apologies. 2009-11-30 Florian Haas Medium: .ocf-shellfuncs: add ocf_is_probe function --HG-- extra : rebase_source : be580449a1f1dc539956bafa39bea81a44d565cb 2009-12-04 Dejan Muhamedagic Low: apache: match lowercase Include statements too 2009-12-02 Florian Haas Medium: build: add perl-MailTools runtime dependency to ldirectord package (LF 1469) ldirectord requires Mail::Send. Add Requires tag for perl-MailTools to the ldirectord RPM package. 2009-12-01 Florian Haas Low: doc: use meaningful shortdesc fields (fixes LF1475) Replace the less-than useful shortdesc fields in most RAs with something that may actually help users.
--HG-- extra : rebase_source : 18bd0d05d3c9d5a9e703e3f77ac7c5828b0fa4fc 2009-12-01 Dejan Muhamedagic Low: build: enable disabling of libnet in configure 2009-11-27 Dejan Muhamedagic Medium: add mercurial repository version information to .ocf-shellfuncs 2009-11-27 Hideo Yamauchi Low: apache: don't drop output to stderr (in kill) Low: mysql: don't drop output to stderr (in kill and mysql start) 2009-11-25 Fabio M. Di Nitto build: relax autotools requirement 2009-11-23 Dejan Muhamedagic merge with upstream 2009-11-23 NAKAHIRA Kazutomo High: syslog-ng: new RA 2009-11-17 Florian Haas Medium: iSCSITarget, iSCSILogicalUnit: support LIO This adds support for the linux-iscsi.org (LIO) iSCSI target implementation to the iSCSITarget and iSCSILogicalUnit RAs. This implementation is intended to work with LIO 3.0 forward. 2009-11-16 Dominik Klein Low: anything (the RA): implement most of lmbs suggestions 2009-11-16 Dejan Muhamedagic Low: ra dtd: add the time type 2009-11-16 Keisuke MORI Medium: pgsql: remove the previous backup_label if it exists 2009-11-13 Lon Hohberger resource-agents: Decrease message level for debug info Resolves: rhbz#526647 2009-11-13 Ferenc Antal resource-agents: Make ip.sh deal with ip address collision Resolves: rhbz#526647 2009-11-13 Dejan Muhamedagic Dev: oracle/oralsnr: add dumporaenv function Medium: oracle/oralsnr: export variables properly Low: build: remove useless (and wrong) AC_DEFINE (thanks to Keisuke Mori) (LF 2034) 2009-11-11 Lon Hohberger resource-agents: Fix some path support bugs in vm.sh 2009-11-10 Raoul Bhatia [IPAX] Medium: postfix: fix double stop (thanks to Dinh N. 
Quoc) Low: mysql-proxy: call validate-all before probe 2009-11-10 Dejan Muhamedagic Low: undo wrongly applied patch for postfix (sorry) Low: IPaddr: remove spurious message from stop 2009-11-10 Junko IKEDA Low: IPaddr: don't remove the host route unconditionally (on stop) Low: IPaddr: improve logging 2009-11-09 Lon Hohberger resource-agents: Add missing primary attribute to SAPDatabase Resolves: Fedora #533972 Part 2/2 2009-11-09 Shane Bradley resource-agents: Add missing primary attribute to SAPInstance Resolves: Fedora #533972 Part 1/2 2009-11-09 Lon Hohberger resource-agents: Fix samba netbios name Spaces should not be allowed in the NetBIOS name. Resolves: Fedora #533971 resource-agents: Fix vxfs support Resolves: Fedora #533970 2009-11-09 Dejan Muhamedagic Medium: nfsserver: validate should not check if nfs_shared_infodir exists (thanks to eelco@procolix.com) (LF 2219) 2009-11-09 Tim Serong Low: IPaddr2: Invalid default value for OCF_RESKEY_clusterip_hash (bnc#553753) 2009-11-05 Andrew Beekhof Hg: Automated merge 2009-11-03 Andrew Beekhof Allow some common build types to work with unconfigured cloned repos 2009-11-03 Lon Hohberger resource-agents: Add "path" support to virsh mode resource-agents: More misc. vm.sh warnings This adds: - warnings if use_virsh="1" is set while path is also set - warnings if you are not root - checks for xm and virsh binaries in $PATH Resolves: rhbz#529926 resource-agents: Report bad config from vm.sh Resolves: rhbz#529926 2009-11-03 Dejan Muhamedagic High: vmware: make meta-data work and several cleanups (LF 2212) First deal with meta-data and usage actions, so that they always work. - validate_vmware() invoked first, then depending on the return code and the requested action we exit appropriately. - All global variables settings moved to set_environment(). - VMSTATE is gone and replaced by calls to vmware_monitor. - Exit code if vmware-vimsh is not installed was OCF_ERR_ARGS. Replaced by OCF_ERR_INSTALLED.
- Severity for several messages adjusted 2009-11-02 Robbert Muller Medium: nfsserver: use check_binary properly in validate (LF 2211) 2009-11-02 Dejan Muhamedagic Medium: IPv6addr: ifdef out the ip offset hack for libnet v1.1.4 (LF 2034) Low: build: recognize libnet api v1.1.4 2009-11-02 Lars Marowsky-Bree Merge local changes with upstream. 2009-11-02 Luke Bigum ldirectord: OCF agent: overhaul Modified version of the Heartbeat ldirectord to return better OCF return codes. The changes make the RA much more robust: we no longer return OCF_ERR_CONFIGURED for bad arguments, which stops the resource from going fatal cluster-wide. We also trap the ldirectord script's normal exit codes and return OCF compliant codes and make a better effort to stop ldirectord. 2009-11-01 Lars Marowsky-Bree RA: Xen: Remove instance_attribute "allow_migrate" (bnc#539968) This is a meta-attribute of the name "allow-migrate" and shouldn't be advertised as a RA instance attribute. 2009-10-30 Lars Marowsky-Bree RA: LVM: Make monitor operation quiet in logs (bnc#546353) 2009-10-29 Lon Hohberger resource-agents: Fix error messages in apache.sh 2009-10-28 Andrew Beekhof Also trap sigpipe - now fully compatible with the libnet version --HG-- extra : rebase_source : 436b3a022bd0b78b6953643452cdd6bac0c06653 2009-10-27 Andrew Beekhof Trap sigterm for compatibility with the libnet version of send_arp --HG-- extra : rebase_source : 814435a84b1a1cd4222f736f0926115fb39198e9 2009-10-26 Lon Hohberger resource-agents: Fix smb.sh return code 2009-10-26 Flavio Leitner rgmanager: Simplify bonded link checking The new bonding driver version 3.4.0 provides ethtool get_link operation so, now, usual link checking works for bonding devices. Resolves: bz518037 2009-10-26 Lon Hohberger rgmanager: fix bug in virsh_migrate 2009-10-23 Florian Haas Medium: RA: VirtualDomain: avoid needlessly invoking "virsh define" Running "virsh define" on every invocation has always been sort of ugly.
We have now identified an issue where some versions of libvirt and/or xend appear to leak memory on virsh define, so as not to exacerbate that problem do it less often. 2009-10-22 Dejan Muhamedagic Low: apache: use /etc/apache2/envvars as default for envfiles 2009-10-22 Fabio M. Di Nitto oracledb ras: stop using obsoleted initlog replace with logger. Thanks to Bill Nottingham for the patch. Fixes rhbz#530197 Merge branch 'master' of ssh://git.fedorahosted.org/git/resource-agents 2009-10-21 Andrew Beekhof High: send_arp - turn on unsolicited mode for compatibilty with the libnet version's exit codes 2009-10-21 Dejan Muhamedagic Low: Build: remove -Waggregate-return IPv6addr.c uses libnet_name2addr6 which returns struct libnet_in6_addr. gcc issues warning because not all compilers support such function calls. Low: ClusterMon: fix comment 2009-10-19 Lars Marowsky-Bree Low: Raid1: Improve monitor function (bnc#546551) 2009-10-19 Dejan Muhamedagic Dev: Filesystem: don't try monitor 10 for non-block devices (thanks to Florian Haas) 2009-10-16 Philipp Wehrheim Low: MailTo: allow multiple word subject line 2009-10-15 Kazunori INOUE Medium: IPv6addr: recognize network masks properly 2009-10-14 Marc Milgram rgmanager: Fix path evaluation during force unmount Resolves: rhbz514040 2009-10-14 Dejan Muhamedagic Dev: Filesystem: modify deep monitor interface The meta-data describes the operation. Depth 10 monitor reads the raw device on which filesystem resides. Reward to a solution that exercises the filesystem circumventing the cache (perl? python?). 2009-10-14 Florian Haas Dev: RA: Filesystem: define status file in a central location Rather than setting statusfile in multiple places, define it once upon initialization. Low: RA: Filesystem: pass the "conv" and "bs" flags to dd Some filesystems may disallow reading from or writing to a filesystem in less than 512-byte increments in O_DIRECT mode. 
Make sure we always operate on 512 bytes at a time, and pad with NULL bytes, when we operate on the device in the monitor operation. 2009-10-12 Dejan Muhamedagic Medium (LF 1331): VIPArip: new parameters for binaries' paths Medium: IPaddr2: check binaries when it makes sense Medium (LF 2147): IPaddr2: behave if the interface is down It is not exactly clear why findif should exit with an error if it cannot find a route which may match the interface. At any rate, IPaddr behaves differently in such situations, i.e. it relies on ifconfig. IPaddr2 now matches that behaviour. BTW, it is a mess, badly needs a rewrite. 2009-10-12 Neil Katin Low (LF 1949): ldirectord: fix typos in OCF RA 2009-10-12 Dejan Muhamedagic Medium (LF 2173): nfsserver: exit properly in nfsserver_validate Medium: oracle: drop spurious output from sqlplus See discussion at http://marc.info/?l=linux-ha&m=125302127316849&w=2 2009-10-06 Andrew Beekhof Hg: Automated merge Fix use of undefined macro @HA_NOARCHDATAHBDIR@ 2009-10-05 Marek 'marx' Grac rgmanager: Rgmanager uses sudo to start/manage tomcat5 service which fails as no tty is available A similar problem also affected other resources that use sudo (postgresql, oracle) Resolves: bz#524757 2009-09-23 Dejan Muhamedagic Low: mysql: fix a typo 2009-09-21 Andrew Beekhof Bug bnc#534803 - Provide a default for MAILCMD 2009-09-18 Fabio M. Di Nitto Merge branch 'master' of ssh://git.fedorahosted.org/git/resource-agents 2009-09-16 Florian Haas Medium: RA: VirtualDomain: loop on status if libvirtd is unreachable (addendum) Stupidly omitted one line in changeset 2949a259a776. Medium: RA: VirtualDomain: loop on status if libvirtd is unreachable When libvirtd is busy or an invocation of "virsh domstate" goes awry, then virsh may return an empty string. In that case, instead of immediately bailing out with $OCF_ERR_GENERIC, keep retrying until we receive a proper response, or time out.
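The retry-on-empty-status behaviour described for VirtualDomain can be sketched roughly as follows. This is not the agent's code: the helper name retry_until_nonempty, the RETRY_INTERVAL variable and the fixed attempt budget are assumptions made for illustration (the entry itself only says the agent retries until a proper response arrives or the operation times out).

```shell
#!/bin/sh
# Seconds to wait between attempts (hypothetical knob, not an RA parameter).
RETRY_INTERVAL=${RETRY_INTERVAL:-1}

# retry_until_nonempty MAX_TRIES COMMAND [ARGS...]
# Runs COMMAND until it prints a non-empty answer or MAX_TRIES attempts
# have been used. A failed invocation is treated like an empty answer.
retry_until_nonempty() {
    max_tries=$1
    shift
    try=0
    while [ "$try" -lt "$max_tries" ]; do
        out=$("$@" 2>/dev/null) || out=""
        if [ -n "$out" ]; then
            printf '%s\n' "$out"
            return 0
        fi
        try=$((try + 1))
        sleep "$RETRY_INTERVAL"
    done
    # Budget exhausted; the caller decides which error code to report.
    return 1
}
```

A caller might use it as: state=$(retry_until_nonempty 10 virsh domstate "$domain_name"), where $domain_name is a placeholder for the configured domain, and map a non-zero return to an error only after the retries are spent.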
2009-09-15 Florian Haas Merge with upstream Low: RA: Filesystem: update/remove ancient comments Update some horrendously outdated comments. Ditch some perfectly obsolete ramblings altogether. Medium: RA: Filesystem: implement monitor operation This patch implements a monitor operation for the Filesystem RA. Previously, monitor was an alias for status, which monitor continues to utilize, it only adds some functionality testing whether I/O on the mounted filesystem is in fact possible. To that end, the patch implements a new parameter "statusfile_prefix", which is unset by default and if left unset, retains the old behavior of monitor being functionally equivalent to status. When set, statusfile_prefix is interpreted as the prefix to be used for a status file for resource health monitoring. If this parameter is set, then the monitor operation will read from, and optionally write to, a file whose filename is constructed from the directory parameter, the statusfile_prefix parameter, and the host name. Its expected content is the name of the resource. If status files should live in their own directory, users should include a trailing slash ("/") in "statusfile_prefix". Note that the status file should not be interpreted as a lock file of any kind -- it just records a per-resource, per-node status. If the "monitor" operation is configured with a depth of 0, and only if statusfile_prefix is set, then the status file is just created once (on resource startup, and only if it does not already exist), and subsequently read. If "depth" is greater than 0, then the status file is periodically read and re-written. Note that if the filesystem is mounted read-only (NFS mounted from a read-only export, for example), and one configures this parameter, then one must create the status file manually or the resource will fail on start. Note also that for monitoring with a depth of greater than 0, the status file must be writable by root. 
This is not always the case with an NFS mount, as NFS exports usually have the "root_squash" option set. In such a setup, one must either use read-only monitoring (depth=0), export with "no_root_squash" on the NFS server, or grant world write permissions on the directory where the status file is to be placed. Another caveat is that the monitor operation requires the "oflag=direct" option to be supported by dd. Some ancient versions of coreutils come with a dd that does not support this option -- on such systems, status file based monitoring is simply not available. RA: Filesystem: allow configuring smbfs mounts as clones It is already allowed to configure Filesystem resources as clones when they represent NFS mounts. The same should be true for Samba mounts. 2009-09-15 NAKAHIRA Kazutomo High: jboss: new resource agent 2009-09-15 Dejan Muhamedagic Low: oracle: new showdbstat action to print output from sqlplus 2009-09-08 Raoul Bhatia [IPAX] Low: mysql-proxy: meta-data update Medium: mysql-proxy: log_level and keepalive parameters 2009-09-07 Raoul Bhatia [IPAX] Low: mysql-proxy: add admin-module options (username/password and lua-script) Low: mysql-proxy: move test for binary to validate-all * some minor cleanup * change exit to return and exit with $? where applicable * move mysql-proxy binary check to validate_all * call validate_all for start and stop actions 2009-09-07 Philipp Reisner Medium: RA: portblock: add per-IP filtering capability Rather than managing iptables rules matching only on a specific port, allow users to specify IP addresses to match on. Retain the previous default of matching on any IP address, by adding a default of "0.0.0.0/0" for the newly introduced "ip" RA parameter. 2009-09-03 Florian Haas Medium: RA: iSCSITarget: be more persistent deleting targets on stop Under some circumstances, a target may not be successfully removed on the first try. Observed with IET only.
Low: RA: iSCSILogicalUnit: make scsi_id and scsi_sn unique Set the unique flag in the RA metadata for scsi_id and scsi_sn. Medium: RA: iSCSILogicalUnit: use a 16-byte default SCSI ID SCSI IDs are limited to 24 bytes in length, but IET is known to support only 16 bytes. Thus, truncate $OCF_RESOURCE_INSTANCE to 16 bytes for the default SCSI ID. 2009-08-28 Federico Simoncelli resource-agents: Handle virsh migration errors gracefully 2009-08-28 Lon Hohberger resource-agents: Fix missing path attribute handling If using the Xen hypervisor with vm configs in a non standard location (e.g. not /etc/xen), rgmanager was ignoring the path attribute, preventing VM management. 2009-08-28 Florian Haas Add the ha_parameter function back into .ocf-shellfuncs. The Heartbeat init script needs this function. This is an ugly cross-package dependency that should be fixed by other means, but for the time being it should at least un-break Heartbeat. 2009-08-25 Dejan Muhamedagic Low: Delay: fix wrong parameter required attributes Low: replace depth=10 with depth=0 2009-08-24 Andrew Beekhof Massage build process Add sample spec file 2009-08-20 Andrew Beekhof Import shellfuncs from heartbeat as badly written RAs use it --HG-- extra : rebase_source : 6f0406cec388d6a6fa057663255d92fc3eb78315 2009-08-18 Dejan Muhamedagic High (LF 2138): IPsrcaddr: replace 0/0 with proper ip prefix (thanks to Michael Ricordeau and Michael Schwartzkopff) 2009-08-18 Andrew Beekhof Dev: Fix the INITDIR comparision Provide a sane default for INITDIR if none of the usual suspects exist 2009-08-14 Lars Marowsky-Bree Merge local changes with upstream. Medium: shellfuncs: Make the mktemp wrappers work. 2009-08-14 Raoul Bhatia [IPAX] High: postfix: new resource agent 2009-08-14 Lars Marowsky-Bree Low: ldirectord: add dependency on $remote_fs. Low: ldirectord: add mandatory required header to init script. ldirectord: Remove superfluous configure artifact. 2009-08-13 Lars Marowsky-Bree Build: Import ldirectord. 
ocf-tester: Fix package reference. findif: actually include the right header. Simplify configure. findif: Include agent_config.h findif: Add some defines. High: Add findif tool (mandatory for IPaddr/IPaddr2) Low: configure: Fix package name. ocf-tester: Fix path to DTD. Remove references to heartbeat shellfuncs. 2009-08-12 Dejan Muhamedagic Low: Virtual: unify VirtualDomain_Status usage 2009-08-12 Florian Haas Medium: VirtualDomain: destroy domain shortly before timeout expiry As lmb pointed out in private email, the VirtualDomain RA is not as persistent on stop as it should be. This patch employs the following logic: - If force_stop is unset and a stop operation is issued, invoke "virsh shutdown" and loop on monitor for (x-5) seconds, where x is the operation timeout received from the CRM. If the domain is still alive by then, kill it with "virsh destroy". - If force_stop is set and a stop operation is issued, invoke "virsh destroy" outright. Low: Virtual: rearrange loops so that unnecessary sleeps are avoided 2009-08-12 Kazunori INOUE High: IPv6addr: new nic and cidr_netmask parameters 2009-08-12 Dejan Muhamedagic merge with upstream 2009-08-12 Andrew Beekhof Ensure HA_VARRUNDIR has a value to substitute --HG-- extra : rebase_source : 941afac8f2d19bc882dbe8bf4adfc7f4e753987b Remove references to Echo function --HG-- extra : rebase_source : a305a420b0dc9f269649d9ff45f34f9eb8063483 2009-08-11 Lon Hohberger rgmanager: Fix clusterfs.sh to use meta_refcnt correctly Red Hat Bugzilla #506094, part 1/3 (Parts 2 and 3 are in the rgmanager repository) 2009-08-11 sborion Low (LF 2159): Squid: make the regexp match more precisely output of netstat 2009-08-11 Ben Last Medium (LF 2165): IPaddr2: remove all colons from the mac address before passing it to send_arp 2009-08-05 Andrew Beekhof Added tag agents-1.0 for changeset bc085cb74dba 2009-08-04 Andrew Beekhof Include license information 2009-07-29 Shane Bradley rgmanager: Assume 'no state' is OK for VMs When vm.sh
does a status check, sometimes "no state" is returned. That state is currently not a "running" state. Thus the status check fails. The "no state" should be treated as a running state per recommendation from libvirt developers. Bugzilla: 514044 2009-07-29 Andrew Beekhof Remove useless path lookups Merge changes from heartbeat: 64f4592952ea 2009-07-29 convert-repo update tags 2009-07-29 Andrew Beekhof Prefer the unversioned auto* binary Hg: Automated merge 2009-07-28 Andrew Beekhof Only install drbd once 2009-07-16 beekhof@localhost.localdomain Populate the authors file 2009-07-10 Florian Haas Low: RA: iSCSILogicalUnit: set default for SCSI SN, truncate SCSI ID default to 24 bytes SCSI IDs are limited to 24 bytes, thus if $OCF_RESOURCE_INSTANCE is longer than that, truncate it for $OCF_RESKEY_scsi_id_default. Also, set a cluster-wide unique, failover persistent default for the SCSI serial number. I choose the first 8 bytes of an MD5 hash of $OCF_RESOURCE_INSTANCE. SCSI SNs can actually be up to 16 bytes long, but 8 bytes should be sufficient to provide uniqueness. All of this, of course, is to ensure smooth failover for iSCSI initiators that read the SCSI ID and SN upon reconnect after a connection interruption. 2009-07-10 Fabio M. Di Nitto Merge branch 'master' of ssh://git.fedorahosted.org/git/resource-agents 2009-07-09 Lon Hohberger drbd: Fix metadata target 2009-07-09 Florian Haas rgmanager: Add resource agent for DRBD DRBD (www.drbd.org) is a shared-nothing synchronous storage replication capable of acting as a drop-in replacement for shared storage. This resource agent manages a DRBD device by switching it into the Primary and Secondary roles as needed. 
For a configuration example, please see http://www.drbd.org/users-guide/s-rhcs-failover-clusters.html Medium: RA: iSCSITarget: reintroduce "tid" resource parameter Much as I liked the idea of referring to targets just by target IQN, using "dynamic" target IDs unfortunately breaks failover for tgt, at least with some iSCSI initiators. The details are explained here: http://lists.wpkg.org/pipermail/stgt/2009-July/003067.html Consequently, I'm reintroducing the tid parameter for iSCSITarget. It is required only on tgt, and optional on IET. iSCSILogicalUnit is unchanged. It still figures out the correct tid from the configured target_iqn. Low: RA: iSCSITarget: rename CHAP authentication parameters, make username unique Since CHAP authentication, as presently implemented in the RA, only applies to "incoming user" authentication, rename parameters accordingly: * "username" -> "incoming_username" * "password" -> "incoming_password" This will allow us to add support for outgoing user authentication without breaking compatibility. Also, since CHAP authentication accounts are target specific in only some iSCSI target implementations, make usernames unique. If in an iSCSI implementation with no per-target user accounts the same username were used for multiple targets, they would all map to one account, with passwords overriding one another. 2009-07-09 Fabio M. Di Nitto drop reference to obsoleted tool 2009-07-08 Lars Marowsky-Bree Build: allow docdir to be configured, and print in summary. Low: Build: Use docdir as base for stdocdir. 2009-07-07 Florian Haas Low: RA: iSCSILogicalUnit: use $OCF_RESOURCE_INSTANCE as default SCSI ID Some iSCSI initiators and other applications rely on a device's SCSI ID to be persistent across target failovers. As some target implementations do not guarantee SCSI ID persistency across failovers, use $OCF_RESOURCE_INSTANCE as the default SCSI ID.
Medium: RA: iSCSITarget, iSCSILogicalUnit: identify targets by IQN, not by tid While implementing LIO functionality for these RAs, I noticed a design flaw in the existing RAs: They identify a target by a numeric "target ID", which is actually IET/stgt specific and not supported in LIO (and presumably, other target implementations). So, I've rewritten these agents to identify and reference targets by iSCSI Qualified Name (IQN). As a consequence: - the iSCSITarget parameter previously named "name" (which was stupid and ambiguous, anyway) is now named "iqn"; - the iSCSITarget parameter "tid" has gone away. Instead, where the implementation requires it, a "target ID" is discerned on the fly; - the iSCSILogicalUnit resource parameter "tid" has also gone away; - iSCSILogicalUnit now has a required parameter "target_iqn", which is to hold the target IQN and is, of course, used to assign the LU to an existing target. 2009-07-02 Florian Haas Low: RA: iSCSITarget, iSCSILogicalUnit: remove useless "return" after check_binary Minor cleanup for the iSCSITarget and iSCSILogicalUnit RAs: check_binary exits with $OCF_ERR_INSTALLED if the requested binary is not found. Thus the useless return statements after check_binary in validate-all can be removed. Low: RA: iSCSITarget, iSCSILogicalUnit: rename instance attributes This patch renames some resource parameters (a.k.a. supported instance attributes) as discussed on the mailing list: * "params" -> "additional_parameters" (so as not to conflict with the CRM shell's "params" keyword, and to also distinguish from other, named parameters that can be set via individual instance attributes). * "initiators" -> "allowed_initiators" (to remove ambiguity) Medium: RA: iSCSILogicalUnit: add support for SCSI ID, SCSI SN, Vendor ID, and Product ID It seems nice to be able to set these in an implementation-independent way.
These VPD attributes could already be set previously by using the "params" instance attribute, but that approach required observing that, for example, the SCSI ID is named "ScsiId" in IET while it's "scsi_id" in tgt. Now, you can just set "scsi_id" and the RA will translate to the appropriate target parameter, as per the implementation. Being able to set the SCSI ID is specifically helpful as some iSCSI initiator implementations rely on consistent SCSI IDs for smooth target failover. Medium: RA: iSCSITarget: add support for CHAP authentication This patch adds support for incoming user authentication using CHAP. It retains the default behavior of allowing unauthenticated access if no username is specified. 2009-06-30 NAKAHIRA Kazutomo Low: RA: pgsql: logging improvements 2009-06-29 Florian Haas Low: RA: iSCSILogicalUnit: add support for per-LU parameters This patch adds support for parameters at the LU level. This comes in handy if one wishes to assign custom SCSI IDs, serial numbers, and the like. Low: RA: iSCSITarget: add support for restricting target access This patch adds support for restricting access to specific targets based on initiator IP address, hostname, or subnet. It retains the default behavior of allowing access from all initiators. Low: RA: iSCSITarget: improve loop over existing connections on stop 2009-06-26 Lon Hohberger rgmanager: follow-service.sl stack cleanup 2009-06-24 Fabio M. Di Nitto build: convert to autoconf/automake/libtool requires: - autoconf 2.63b - automake 1.11 - libtool 2.2.7a - pkgconfig 0.23 - m4 1.4.13 2009-06-22 Federico Simoncelli rgmanager: Allow vm.sh use of libvirt XML file This allows use of libvirt XML files to create transient virtual machines instead of statically defined virtual machines. This allows putting libvirt XML files on, for example, cluster file systems. 
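As a rough illustration of the transient-VM entry above: virsh offers "create" (start a transient domain straight from an XML file) alongside "start" (boot a domain that is already defined). The sketch below only builds the command line rather than running it; the function name vm_start_command and its argument order are hypothetical and are not taken from vm.sh.

```shell
#!/bin/sh
# vm_start_command NAME [XMLFILE]
# Prints the virsh command a start action could run.
# Assumption: an XML file, when given, describes a transient domain.
vm_start_command() {
    name=$1
    xmlfile=${2:-}
    if [ -n "$xmlfile" ]; then
        # Transient domain: created from the XML description on start.
        printf 'virsh create %s\n' "$xmlfile"
    else
        # Statically defined domain: just start it by name.
        printf 'virsh start %s\n' "$name"
    fi
}
```

Keeping the XML on a cluster file system, as the entry suggests, means every node can pass the same file path without any per-node domain definition.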
2009-06-22 Lon Hohberger rgmanager: Fix stack overflows on stress testing S/Lang requires you to eat all the values on the stack if not assigned, or you can discard all values, but you can't take 2/5 without explicitly discarding the other three. rgmanager: Fix restart-after-migrate issue * Fixes erroneous migration_mapping noise. Resolves Red Hat Bugzilla #505340 rgmanager: Check for all ORA- errors on start/stop Resolves: 471066 2009-06-22 Marc Grimme rgmanager: Implement explicit ordering for failover This allows users to define an explicit service processing order when central_processing is enabled. Previously, users would have to order things the way they wanted in cluster.conf. Resolves: 492828 2009-06-22 Lon Hohberger rgmanager: Fix up multiple Oracle instance handling Note that you can only have one instance start/manage Enterprise Manager. You must set all other instances to "base" type (see metadata). Resolves: 471226 rgmanager: Remove extra checks from Oracle agents Resolves: 470917 rgmanager: Make vm.sh use libvirt This makes rgmanager use virsh/libvirt instead of xm so it can manage either KVM or Xen virtual machines. It will still use xm if a user has a non-standard config file path, however, the recommendation is that the user store all Xen configuration files in one place and use bind mounts to mount the requisite path on /etc/xen. Virsh does not have a notion of a search path like xm, therefore, further support of the path attribute in xm will be limited. During this partial overhaul, extraneous attribute parsing was removed and the 'stop' phase now correctly returns failure if it cannot contact the hypervisor. Users may force the use of 'xm' by adding: use_virsh="0" to vm resources. Furthermore, users may force use of a specific hypervisor by using: hypervisor="qemu" or: hypervisor="xen" The other important note is that libvirtd is now required to be running in order to operate with the virsh command set even if using the Xen hypervisor.
Resolves: 412911 468691 rgmanager: Optimize fork/clone during status checks * New option: quick_status trades off logging and verbosity for vastly improved performance. This reduces or eliminates load spikes on machines with lots of file system resources mounted, but should only be used in such cases (because logging is disabled when using quick_status) rhbz250718 (lots of analysis here) rhbz487599 (RHEL4 bug) rhbz250718 (RHEL5 bug) rgmanager: Make 'make check' work for resource-agents rgmanager: Fix DTD so that it actually works 2009-06-22 Florian Haas Low: RA: don't show which(1) errors Low: RA: iSCSITarget/iSCSILogicalUnit: don't advertise lio support 2009-06-18 Florian Haas Medium: RA: portblock: fix invalid exit codes on monitor IptablesStatus() returns 1 ($OCF_ERR_GENERIC) whenever the iptables rule is not configured. This breaks probes whenever a resource which is expected to be stopped, is in fact stopped. Also, this patch removes a pointless function wrapper and uses $OCF_RESOURCE_INSTANCE for the ha_pseudo_resource state file name. This updated patch also removes another obsolete comment. 2009-06-17 Florian Haas High: RA: iSCSITarget, iSCSILogicalUnit: add support for tgt Dev: RA: iSCSITarget, iSCSILogicalUnit: improvements and forced disconnect Add a generic command wrapper to log command, their output, and their exit code (stolen from lmb's original drbd OCF RA). Set a system-dependent default implementation based on the availability of administration utilities. Implement forced connection shutdown on target stop (many thanks to Lars Ellenberg for the sed wizardry). 
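A generic command wrapper of the kind mentioned in the iSCSITarget/iSCSILogicalUnit entry might look roughly like this. It is a minimal sketch, not the agents' implementation: the names do_cmd and log_msg are placeholders, and a real agent would presumably log through ocf_log from .ocf-shellfuncs instead of writing to stderr.

```shell
#!/bin/sh
# Placeholder logger (assumption): writes to stderr so that stdout
# stays free for the wrapped command's own output.
log_msg() {
    printf '%s\n' "$*" >&2
}

# do_cmd COMMAND [ARGS...]
# Runs the command, logs the command line, its combined output and its
# exit code, passes the output through on stdout, and returns the
# command's exit code unchanged.
do_cmd() {
    cmd_rc=0
    cmd_output=$("$@" 2>&1) || cmd_rc=$?
    log_msg "command: $* (exit code $cmd_rc)"
    if [ -n "$cmd_output" ]; then
        log_msg "output: $cmd_output"
        printf '%s\n' "$cmd_output"
    fi
    return "$cmd_rc"
}
```

Routing every administration utility call through one wrapper keeps the log format uniform and makes a failed stop or start easier to trace back to the exact command that returned the error.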
2009-06-14 Dejan Muhamedagic Low: RA: apache: make sure that proxies are not used for monitor 2009-06-14 Florian Haas High: RA: iSCSITarget/iSCSILogicalUnit: two new resource agents 2009-06-09 Andrew Beekhof Import the needed pieces of /etc/ha.d/shellfuncs Syntax highlighters look for END Fix send_arp compilation Fix sfex compilation Import sfex, send_arp and ocf-tester 2009-06-08 Dejan Muhamedagic Dev: RA: IPv6addr: revert patch 8d2fa7da1d29 since the checksum issue has already been solved (LF 2034) 2009-06-07 Dejan Muhamedagic Dev: RA: drbd: fix metadata 2009-06-06 Andrew Beekhof Build: Fix the IPv6addr agent Configure against cluster-glue instead of heartbeat-common 2009-06-05 Andrew Beekhof Build: configure and installation improvements 2009-06-05 Dejan Muhamedagic Low: RA: IPaddr2: include netmask in search for the right interface (by indego on irc) 2009-06-05 Andre, Pascal High: RA: IPv6addr: supply checksum for ICMPv6 packets 2009-06-04 Dejan Muhamedagic Medium (bnc#499291): RA: iscsi: replace wrong variable reference 2009-06-04 Lars Marowsky-Bree Merge local changes with dev. 2009-05-28 Raoul Bhatia [IPAX] Low: RA: mysql-proxy: cleanup 2009-05-28 Andrew Beekhof Hg: Automated merge 2009-05-26 Raoul Bhatia [IPAX] High: RA: mysql-proxy: new RA 2009-05-26 Dejan Muhamedagic Low: RA: set start-delay to 0 in all RA Low: RA: drbd: cleanup the meta attributes mess 2009-05-22 Marek 'marx' Grac apache.sh: #489785 - apache.sh does not handle a valid httpd.conf 2009-05-14 Florian Haas Low: RA: VirtualDomain: Improve status and migrate operations 2009-05-13 Lars Marowsky-Bree RA: SAPDatabase + SAPInstance: New versions from SAP. 2009-05-13 Dejan Muhamedagic Medium: RA: drbd: support drbd versions >=8.3 2009-05-04 Fabio M. 
Di Nitto build: fix doc Makefile to stub targets misc: drop obsoleted bits 2009-04-24 Dominik Klein Low: RA: mysql: Correctly remove eventually remaining socket 2009-04-24 NAKAHIRA Kazutomo High (LF 2112): RA: sfex: checkproc/killproc are not available everywhere 2009-04-23 Andrew Beekhof Hg: Automated merge 2009-04-23 convert-repo update tags 2009-04-21 Dejan Muhamedagic Low: Build: make gcc version processing more compact Low: Build: some gcc version numbers are big Low: Build: simplify printing gcc version 2009-04-10 Dejan Muhamedagic Low: RA: IPaddr2: missing $ in variable reference 2009-04-09 Dejan Muhamedagic Low: RA: Dummy/AoEtarget: remove some testing code 2009-04-08 Dejan Muhamedagic Low (LF 2108): RA: Raid1: reduce noise 2009-04-08 Florian Haas High: RA: AoEtarget: new RA to export ATA-over-Ethernet (AoE) targets 2009-04-08 Jean-Francois Larvoire High (LF 2108): RA: Raid1: don't always exit with error if the device doesn't exist 2009-04-08 Dejan Muhamedagic High (LF 2108): RA: Raid1: stop action failure if the device doesn't exist 2009-04-06 Dejan Muhamedagic Low (LF 1921): RA: IPaddr: replace tabs with spaces in the output of findif Low (LF 1915): build: bad reference to dirlist Low (LF 2042): build: exit with error if zlib is missing Low: RA: oralsnr: stop a half-running listener/improve logging 2009-03-30 Lars Marowsky-Bree RA: LVM: Fix return code in case activation failed. RA: LVM: Only deactivate locally as well. RA: LVM: Change default activation mode to "local". LVM resources are supposed to be configured as clones, so each of them should only activate the VG locally. RA: LVM: Support "exclusive" parameter to control VG activation mode. 2009-03-27 Lon Hohberger resource-agents: Status check tuning/optimization * Don't bother with status checks on 'service' abstract resource. 2009-03-24 Fabio M. 
Di Nitto build/init: install/create common dirs a bunch of directories need to be created either at install time or by init scripts for the stack to operate properly. Make sure that the basic is all there from upstream. 2009-03-24 Lars Marowsky-Bree RA: LVM: Return proper exit code if vgchange version could not be determined. 2009-03-23 Dejan Muhamedagic Low: build: package stonith documentation 2009-03-22 Lars Marowsky-Bree RA: Filesystem: correct return code if device/mountpoint are missing on start. 2009-03-12 Lon Hohberger rgmanager: Fix ip start phase with monitor_link="0" 2009-02-26 Lon Hohberger rgmanager: Enable exclusive prioritization use case Use case: * 4 nodes running a 'slave' service * 1 node must be made master, but first stop the instance of the 'slave' service running on that node cluster.conf example: ... Real world example: https://bugzilla.redhat.com/attachment.cgi?id=330261 rhbz482858 2009-02-26 Mark Hlawatschek resource-agents: Tweak environment for SAP resource agents rhbz479708 2009-02-26 Juanjo Villaplana resource-agents: fix netfsclient cache handling Ensures correct removal of temporary file when client caching is disabled. Resolves: rhbz486329 2009-02-25 Fabio M. Di Nitto build: stricter install invokation make install fails in shell loops to detect errors. 2009-02-24 Florian Haas Support "idle" as Xen domain status. Recent libvirt versions return "idle" on "virsh domstate", where previous versions returned "blocked". 2009-02-25 Alan Robertson Low (LF 2082): RA: oralsnr: meta-data fix 2009-02-25 Hideo Yamauchi Low: RA: tomcat: add support for the catalina log rotation. 2009-02-20 Philipp Kolmann IPv6addr fails on /64 prefixes Hi Simon, it seems I am the one unlucky guy who uses heartbeat with IPv6.... 
I started updateing my cluster today to lenny and IPv6addr fails again: scs1:/etc/heartbeat/resource.d# ./IPv6addr 2001:629:3800:33:0:0:0:122 start 2009/02/16_20:19:50 ERROR: Generic error ERROR: Generic error I dug into the source of IPv6addr.c and it seems that the mask is too long and therefore the scan_if isn't matching. I have a 2001:629:3800:33::/64 subnet but it seems from my debug output that IPv6addr tries to match /96 bits of the IP address which fails. My C knowledge is sadly too little to fix this myself. I would greatly be happy if you could help me with that. ------------------------------------------------------------------------------ Hi, we found some discussion about this issue here: http://www.velocityreviews.com/forums/t283343-shifting-bits-shift-32-bits-on-32-bit-int.html In post #4 it reads: The behaviour of shifts defined only if the value of the right operand is less than the number of bits in the left operand. So shifting a 32-bit value by 32 or more is undefined... further info in #7: Better yet, read the first part of section 5.8 of the ISO/IEC 14882:2003 standard: The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand. So it seems that my patch is the proper fix in the end after all. Attached as file, since BT distroyed the formatting. ------------------------------------------------------------------------------- This bug was reported as Debian bug #515662 http://bugs.debian.org/515662 2009-02-17 Xinwei Hu High: RA: scsi2reservation: fix wrong logic in check for scsi_reserve 2009-02-03 Lars Marowsky-Bree RA: anything: some comments added. 2009-02-02 Marek 'marx' Grac [RGMANAGER] Resolves #483093 - samba.sh tries to kill the wrong pid file Resource agent was using old pid file naming schema. 
Now it is 'smbd-$config_filename.pid' 2009-01-30 Dominik Klein High: RA: anything: new OCF RA for arbitrary daemons 2009-01-28 Dejan Muhamedagic Low (LF 1982): RA: VirtualDomain: exit with proper code if there is a problem with the configuration file 2009-01-26 Marek 'marx' Grac [RGMANAGER] Resolves #481058 - Add option startup_wait for mysql RA 2009-01-20 Marek 'marx' Grac [RGMANAGER] Resolves #449394 - Recovery policy of type restart doesn't work [RGMANAGER] Resolves: #474444 - Zero-length pid files cause resource start failures 2009-01-20 Keisuke MORI High (LF 2034): RA: IPv6addr: fix aborting on x86_64 and sending neighbor advertisements 2009-01-13 Fabio M. Di Nitto rgmanager: fix again randomization of temp files build: adapt Makefile to expand for EVENT_TARGETS 2009-01-13 Mark Hlawatschek [rgmanager] Add no_unmount to netfs.sh 2009-01-12 Fabio M. Di Nitto misc: Update copyright for 2009 also purge non-relevant part of the COPYRIGHT file. 2009-01-09 Dejan Muhamedagic Low: RA: apache: improve the meta data documentation 2009-01-07 Fabio M. Di Nitto split tree into separate projects new resource-agents master test commit 2009-01-05 Christine Caulfield cman: Make API calls work on an inquorate system cman: fix error checking in testcmanquorum1 cman: more corosync changes cman: Make new services compile with latest corosync 2009-01-05 Fabio M. Di Nitto config: time to say goodbye to ccsd remove all legacy code that won't be released from master anymore qdisk: fix mkqdisk output we killed output to std* by default in logging. we can't use log*print* anylonger for this operation. Switch back to printf. 
2009-01-02 Dejan Muhamedagic High (LF 1952): RA: Filesystem: implement bind mounts 2009-01-02 Christine Caulfield cman: let 'cman_tool leave -w' wait even if shutdown has already started cman: Return an error if 'cman_tool leave' is attempted during shutdown There were some occasions when 'cman_tool leave' simply returned success even though shutdown had already started. This fixes those. cman: make 'cman_tool leave -w' wait until cman has shut down Once we get a successful return from cman_try_shutdown there still might be a delay before cman actually shuts down because the LEAVE messages need to be sent around the ring. cman_tool now waits until it can no longer contact cman if -w is given on the command-line. 2009-01-01 Dejan Muhamedagic Low (LF 1946): RA: drbd: replace single with double quotes to allow expansion 2008-12-30 Dejan Muhamedagic High (LF 2025): RA: Filesystem: on stop, check if the filesystem has been umounted before shooting processes 2008-12-29 Dejan Muhamedagic Medium: RA: mysql: handle monitor and stop properly on invalid environment 2008-12-29 Raoul Bhatia [IPAX] Low: RA: mysql: su fails if the user has an invalid shell 2008-12-23 David Teigland dlm_tool: change to new debugfs scan The rsb addr was added to the output. 2008-12-23 Dejan Muhamedagic Low: RA: VirtualDomain: fix unique (attributes) in meta-data Low: RA: VirtualDomain: shorten the big sed expression 2008-12-23 Florian Haas Medium (LF 2019): RA: VirtualDomain: live vm migration 2008-12-22 Bob Peterson gfs: improve gfs_fsck rindex repair code bz 442271 - GFS: gfs_fsck bugs found in rindex repair code This patch makes improvements and fixes some bugs in gfs_fsck's rindex repair code. Basically, if RGs are damaged, especially in the third section (i.e. RGs added by gfs_grow) it did not properly locate the RG boundaries. Also, if the rindex was completely zeroed out, it did not recover it in cases where the file system had been extended. 2008-12-19 Fabio M.
Di Nitto build: restore original behaviour when building groupd fix small regression introduced when merging pacemaker build patches 2008-12-19 Christine Caulfield cman: fix cman_tool join return code The last checkin broke the cman_tool return code. The FORKED and SUCCESS messages could (and in a normal run often would) be returned as part of the same pipe read. 2008-12-18 Christine Caulfield cman: drastically improve startup errors cman_tool join has a nasty habit of just exiting with "corosync failed to start" or some such unhelpful error message. This patch improves on these by trapping the corosync exit code and attempting to interpret it for the user. 2008-12-17 David Teigland dlm_tool: lockdebug using new debugfs file A new dlm patch adds a new debugfs file that shows more info about rsb/lvb/lkb structs. dlm_tool lockdebug will use this new file, if available, and display the new info in a format similar to the original format. 2008-12-17 Dejan Muhamedagic Low: RA: vmware: remove bashisms Low: RA: Xen: remove bashisms Low: RA: oracle/oralsnr: replace typeset with local Low: RA: SysInfo: put bash in the bang line Low (LF 1487,1903): RA: IPaddr: improve network interface lookup on *bsd systems 2008-12-17 Christine Caulfield cman: add cman3 services These services are corosync plugins and libraries that implement the 'old' cman functions in a more standard manner. The quorum plugin is a generic quorum provider that uses the CMAN (openVMS) algorithm to provide quorum and adds the hooks for a quorum device. With a compilation flag it can be made wire-protocol compatible with the 'old' cman plugin. The new cman service provides everything else that old cman provided that's not related to quorum and is intended as a backwards-compatibility layer. The libcman part is not yet complete so I'm not going to replace the existing 'daemon' directory here just yet, also I don't think it's ready to be built as standard with the rest of the tree.
This is more of a statement of intent than working code ;-) Signed-Off-By: Chrissie Caulfield 2008-12-17 Fabio M. Di Nitto ccsd: port to logthread infrastructure Several changes across the whole tree to use logthread instead of logsys. libccsconfdb: do not allow logfile_priority to override debug. debug output is generally more important than logfile_priority. logthread: filter messages to stderr to respect logging priorities dlm_controld: stop linking against logsys Remove leftover linking bit from Makefile. 2008-12-16 Dejan Muhamedagic Low: RA: SysInfo: strip leading 0 from a var expansion and save the bash parser 2008-12-16 Bob Peterson Make gfs2_freedi delete indirect blocks with height >= 2 bz 474707 - GFS2: gfs2_convert not freeing blocks when removing file with height >=2 Before this patch, the libgfs2 function gfs2_freedi would only free indirect blocks from an inode at height==1. If an inode grew to a height of 2 or more, some blocks would not be freed when they should have been. This patch allows it to free all indirect blocks at all heights. In the case of the bugzilla, the users would run gfs2_convert on a gfs file system and afterward, gfs2_fsck would find a number of blocks that should have been freed, but were not. 2008-12-15 Bob Peterson Remove splice_read file op for jdata files. bz 436811 - Some files with inherit_jdata flag will not allow reads from apache The splice_read op is supported for normal files but not journaled files. Rather than return a bad return code when invoked, this removes the splice_read op from the file ops for jdata files. That enables callers to find an alternative method of accessing the jdata file rather than failing the operation. 2008-12-15 Fabio M. 
Di Nitto build: fix fence_scsi installation bits build: fix typo and get rgmanager to build again 2008-12-15 Marek 'marx' Grac [FENCE] Support for 'metadata' option for fencing agents / default values Fencing library now understands new 'metadata' action which shows available commands (for use on stdin). Output should be compatible with stonith agents. Currently it is just partial (parameters only). We will extend it according to what we really need. It should be possible to use these metadata for building UI. Device options now contain 'short description' and earlier option 'order' was added so we can set the correct order (it will be same as in --help). Default values are now stored in fencing library as they are used for metadata generation. Fence agents can set their default settings to all_opt. There is a bit of redundancy ('required') because we will check it later. But I was not able to describe requirements without using a non-naive schema. 2008-12-14 Dominik Klein Low: RA: drbd: restore the master setting on probe 2008-12-14 Dejan Muhamedagic Low: build: really don't build quorumd High: build/packaging: ha_logd got its own init.d/logd script 2008-12-13 Dejan Muhamedagic Low: RA: mysql: validate environment for stop and monitor actions 2008-12-13 Dominik Klein Low: RA: mysql: check if mysqld binary exists and if the pid dir is writeable 2008-12-13 Florian Haas High: RA: VirtualDomain: add monitor scripts hook 2008-12-12 Lon Hohberger rgmanager: Part 2 - flip logging function name rgmanager: Fix up logging, part 1 * Let clulog report the parent resource-agent * Use ccs_read_logging API * Don't break old configs using rm/@log_level or log_facility 2008-12-12 Bob Peterson Grab hold of journal-turned-RG buffers so they're not freed. bz 471618 This is addendum patch 3 for bug 471618. The problem is, we weren't making a claim on buffers that used to be for the journals under GFS, but now are RG space for GFS2.
So the buffer may go away especially when running gfs2_convert on a file system that's full. When the buffers are released at the end, they may not really exist any more, which caused a segfault. Grabbing hold of the buffer ensures they won't go away, so freeing them will work as normal. 2008-12-12 Lon Hohberger fence_xvm: Use new logging config parameters logthread: Add missing prototype logthread: Make multiple init/exit calls work * This patch allows a user to call logt_init()/logt_exit() as much as is needed. * A new function, logt_reinit() is added for convenience. This function uses values previously supplied to the logt_init() function to reinitialize the log system. This can be done post-fork, for example. Note: you MUST call logt_init() and logt_exit() before logt_reinit() can be used. 2008-12-12 Christine Caulfield cman: fix memory leak This is a very small memory leak but it could add up in some very long-lived processes that send a lot of messages. 2008-12-12 David Teigland dlm_controld/gfs_controld: read lockless resources from ckpts When we mount and read plock state from a checkpoint, we can't ignore zero-size checkpoint sections, because we need to add the resource along with its owner from the section id. What should be skipped are zero-length section id's. dlm_controld/gfs_controld: dump unused resources Unused/cached resources have owner state that's useful to see in the debug dump. dlm_controld/gfs_controld: fix plock rate limiting When the plock rate limiting code enabled locking after disabling it, it would be immediately disabled again. So, once plocks were disabled due to the rate limit, they would never be processed again. 2008-12-11 Lon Hohberger rgmanager: Fix license for follow-service.sl 2008-12-11 Christine Caulfield cman: Make cman-preconfig reload too When reload is called we now create a new totem key for corosync. I have a patch for corosync that will allow dynamic configuration of totem parameters. 
cman: Make cman the quorum provider for corosync cman now plugs into the Quorum API for corosync and tells it (and any interested subsystems) whether the cluster has quorum. 2008-12-10 Bob Peterson mkfs.gfs2 hangs with many journals bz 471618 This is the addendum 2 patch for bug #471618. There are two fixes: (1) switches the order back to normal (reverse) when freeing buffers. This ensures the oldest buffers are freed before the more recent, which is what we want and how it was originally. (2) The get_first_leaf and get_next_leaf code were making the faulty assumption that the buffers they're searching will be in memory. That may not be the case, especially if the buffer order is incorrect (as in fix (1)). Use jbsize for height computations on journaled files. bz 475488 This is an addendum patch for bug #471618. Without this patch, users may encounter an infinite loop in mkfs.gfs2 if the number of journals causes the block size to be exceeded for the per_inode system directory. For a block size of 1K, specifying five journals on mkfs.gfs2 will cause this problem. For a default 4K block size, you would likely need 17 journals to get the hang. 2008-12-10 David Teigland dlm_controld/gfs_controld: plock config paths In cluster2 where plocks are done by gfs_controld, plock config options are under . In cluster3 plocks are done by dlm_controld (and gfs_controld for back-compat), and plock config options are under (or ). Make both gfs_controld and dlm_controld look for plock config options under both and . 
2008-12-10 Dejan Muhamedagic Medium: RA: IPaddr2: consolidate init/validate (thanks to Tim Verhoeven) 2008-12-10 Lon Hohberger rgmanager: Include follow-service.sl in the install 2008-12-10 Mark Grimme rgmanager: Add follows-service script 2008-12-10 Mark Hlawatschek rgmanager: Update SAPInstance / SAPDatabase to current versions 2008-12-09 Lon Hohberger rgmanager: make dtest compile 2008-12-09 Jan Friesse fence: VMware VI better handling of "strange" names Patch solving situations when user enters machine name with \t in name. Output of helper is now in DSV format. fence: VMware VI helper path fix Fix location of VMware VI helper in main agent. 2008-12-09 Fabio M. Di Nitto build: install fence_vmware_vi_helper in sbindir build: don't set exec bit on built files. this should be done by install. build: install fence_vmware_vi bits in the appropriate locations no changes to the fencebuild.mk script were required after all. 2008-12-09 Jan Friesse fence: Added fence agent based on VMware VI API Why another VMware agent? Because VI API is the only one that is really cluster aware (this means, it is the only one ready to distinguish between datacenters, folders, ...). 2008-12-09 Fabio M. Di Nitto build: fix fence agents man page Makefile remove duplicate entry and sort alphabetically. 2008-12-08 Steven Whitehouse cman: loading lock_dlm module should be optional in initscript Loading the lock_dlm module should be optional since it is optional in the kernel configuration, and in the future we intend to merge it into gfs2 itself. 2008-12-08 Fabio M. Di Nitto build: fix dlm_controld makefile clean a bit the ifdef enable_pacemaker and fix an object entry. dlm_controld: include saAis from openais the duplicate version in corosync has been correctly removed from the tree. Use the right one. libccs: build with latest corosync replace saAis.h with corotype.h and SA_AIS_OK with CS_OK.
2008-12-05 David Teigland gfs_controld: use new uevent strings Use the new uevent strings when available, otherwise fall back to using sysfs files. (Fixes and some minor munging to Steve's initial patch.) 2008-12-04 David Teigland dlm_controld/gfs_controld: plock dump display resource owner When plock ownership is enabled, it's important to see which node is the resource owner. gfs_controld/dlm_controld: fix lock syncing in ownership mode bz 474163 Locks that are synced due to a resource being "un-owned" were having their read/write mode reversed on the nodes being synced to. This causes the plock state on the nodes to become out of sync, and operate wrongly. 2008-12-03 Nicolas MONNET rgmanager: Make postgres-8.sh use su instead of sudo Resolves part of #462910 2008-12-03 Jan Friesse fence: Set binary on telnet connections Resolves: #469066 2008-12-03 Dejan Muhamedagic Low: RA: MailTo: check if MAILCMD is set early 2008-12-02 Dejan Muhamedagic High: build: drop erroneously inserted VirtualDomain 2008-12-02 Andrew Beekhof dlm_controld: pacemaker build stuff to build pacemaker version of dlm_controld 2008-12-01 Bob Peterson Fix many bugs with gfs2_convert. bz 471618 - GFS2: gfs2_convert is broken Functional changes to the code are as follows: 1. Original problem is fixed The original problem reported in the bug was caused because there is a fundamental shift of indirect block pointers between GFS1 and GFS2. For indirect blocks, GFS1 added 64 bytes of reserved space after the gfs meta header. GFS2 did not. That meant GFS1 could hold fewer pointers than GFS2 on the same block. The meant code had to be added to gfs2_convert to shuffle all the pointers around to their corresponding gfs2 locations. This was not a simple calculation because of sparse files. 
For example, with a 1K block size, if you do: dd if=/mnt/gfs/big of=/tmp/tocompare skip=496572346368 bs=1024 count=1 the resulting metadata paths will look vastly different for the single block of data: height 0 1 2 3 4 5 GFS1: 0x16 0x46 0x70 0x11 0x5e 0x4a GFS2: 0x10 0x21 0x78 0x05 0x14 0x78 It is relatively easy to calculate the new metapath, but you can't just shuffle a pointer from its old location to its new location because the destination slot might be used for a different source slot. In the example above, we couldn't just move the pointer at height 3 from offset 0x11 to another block at offset 0x05 because there might already be a pointer at 0x05 being used for something else. To complicate matters, we could not just assign new blocks and copy the data because we should be able to run on a "full" file system. So I had to write a complex new function called adjust_indirect_blocks along with several support functions. Someone else might be able to write a function to get the job done simpler, but at least this one works for both sparse and fully-packed metadata trees of several heights. It basically formulates in memory a metadata tree for all the data blocks, clears out all the metadata buffers, and lays them all back down again according to the new layout. 2. Improved error checking I added some badly needed error checking. That's because I was running into problems and wanted to eliminate the possibility that errors were occurring, but not being reported. 3. Improved progress reporting With a large number of resource groups, I saw long periods of time where the program would just sit for five minutes. I was afraid that the tool had hung, so I kept breaking in with gdb to check on the tool's progress. But if I'm that concerned about what it is doing when it's quiet, so too will be the customers who are anxiously trying to watch its progress. So I added periodic reports of progress through the RGs when converting inodes, and also an occasional "." 
when the RGs themselves are being analyzed and/or manipulated. Same goes for writing out the new gfs2 journals. The journals in gfs2 are quite different and we need to write them out entirely, all 128MB. That takes a fair amount of time going through the functions in buf.c. So I added a new message when each journal is written, so that the code doesn't appear stuck for a long period of time. This could still use some improvement, but it's better. I also modified the progress messages to make them more clear. For example, the message "Removing obsolete gfs1 structures" sounded to me like it could cause alarm or panic to a customer. So I rephased it to "Removing obsolete GFS1 file system structures". 4. Journal size was not being preserved In testing, I discovered that if the source file system had 32MB journals, the reformatted GFS2 file system had 128MB. I decided that's not good and decided to fix it. That is especially important if the file system is "full" to begin with because we can't add more blocks. 5. Fixed a segfault dealing with buffer management. 2008-12-01 Christine Caulfield cman: make it compile with latest corosync corosync_tpg_handle was renamed to cs_tpg_handle for some reason. cman: Don't crash cman_tool nodes -a cman was returning a success status but no data if the node id passed to cman_get_node_addrs isn't currently in the cluster. 2008-12-01 Florian Haas High: RA: VirtualDomain: new OCF RA (manage virtual domains using libvirt/virsh) 2008-12-01 Fabio M. Di Nitto fence: install virsh fence agent man page 2008-11-27 Steven Whitehouse GFS: Send useful information with uevent messages In order to distinguish between two differing uevent messages and to avoid using the (racy) method of reading status from sysfs in future, this adds some status information to our uevent messages. This patch makes the same changes as the recent GFS2 patch. 
2008-11-26 Steven Whitehouse GFS: Send sensible sysfs stuff The fix to make gfs1 send the info that we need to process the uevents which it generates. 2008-11-25 David Teigland group_tool: fix dump dumping the groupd debug buffer broke a while back due to not reading the correct buffer size gfs_controld: cannot connect to dlm_controld error Log an error message stating the reason for failing to start when it cannot connect to dlm_controld. fenced/dlm_controld/gfs_controld: improve groupd waiting When polling groupd for the group_mode during startup, wait longer (30 sec) if we're getting an EAGAIN response from groupd, meaning groupd is still working on it. Wait for only 5 sec if we can't connect to groupd at all, which probably means it's dead. groupd/fenced/dlm_controld/gfs_controld: get logfile from ccs ccs_read_logging bug has been fixed, let it set logfile again. libccs: fix ccs_read_config The helper function for reading a string needs to memset the destination buffer to zero even when there's no new string to copy to it. The result of this was most obvious by the random data it would copy into the logfile name. 2008-11-25 Jan Friesse fence: Add libvirt (virsh) based agent Following agent is very different in way how it works from our xvm fence agent. This agent connects via ssh to dom0 and there run virsh, which performs required action. rhbz#472785 2008-11-25 David Teigland dlm_controld: recv error checking in process_uevent wasn't looking at errno. 2008-11-25 Steven Whitehouse gfs_controld: recv error checking in process_uevent wasn't looking at errno. 2008-11-25 Jan Friesse fence: fix IPMI over lan to support ciphersuite select If user select lanplus as IPMI protocol, ipmitool automatically select cipher type 3. This patch add possibility to select another type of cipher. New -C parameter is directly passed as -C parameter to ipmitool. 
rhbz#447497 2008-11-24 David Teigland groupd/fenced/dlm_controld/gfs_controld: log logging settings Put the logging settings in the debug log. ccs_read_logging() is corrupting the logfile parameter, so temporarily pass it a junk buffer. liblogthread: do nothing without init If logt_init() hasn't been called, or if logt_exit() has been called, do nothing with a logt_print() or a logt_conf(). fenced: log protocol message type in debug logs instead of "unknown". fenced/dlm_controld/gfs_controld: error handling in groupd detection If a daemon is supposed to get group mode from groupd and can't, then exit with a failure. Also, retry groupd detection whether we get -EAGAIN back from groupd, or fail to get a version from groupd. fenced/dlm_controld/gfs_controld: log exiting message only once The "cluster is down, exiting" message should only be logged once by a daemon. groupd: libcpg mode can skip some more libgroup stuff In confchg for the "groupd" cpg, we should exit earlier when in libcpg mode instead of going through pointless libgroup steps. 2008-11-24 Lon Hohberger qdisk: Allow old logging style until next release qdisk: More misc cleanups. * Nuke README; information that used to be there is in qdisk.5 anyway * Nuke crc32.c and put the wrapper function in disk.c since that is the only place it is used. qdiskd: Misc. cleanups, esp. loop cleanups in main.c Also use the ../../common/liblogthread API for now at least until logsys v2 stabilizes 2008-11-24 Ryan O'Hara Fix check_mount to correctly test if device is mounted/busy. Attempt to open the device with O_EXCL flag. If errno is EBUSY, the device is busy/mounted. (BZ 240584) 2008-11-21 David Teigland groupd/fenced/dlm_controld/gfs_controld: startup info messages When each daemon starts have it log an INFO message showing its name and version. Also, have groupd consistently log an INFO message showing the compat mode that's been selected.
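Note on the check_mount entry above (BZ 240584): the test it describes, opening the device with O_EXCL and treating EBUSY as "mounted or busy", might look roughly like the sketch below. This is an illustration only, not the actual check_mount code; the function name device_is_busy and its return convention are assumptions. On Linux, O_EXCL without O_CREAT has defined behaviour only for block devices.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper.
 * Returns  1 if the device is busy (mounted or otherwise held),
 *          0 if it could be opened exclusively,
 *         -1 on any other error (errno is left set by open). */
static int device_is_busy(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY | O_EXCL);
    if (fd < 0)
        return errno == EBUSY ? 1 : -1;

    close(fd);
    return 0;
}
```

A caller would refuse to run a destructive tool on the device when the result is 1, and report the errno-based error when it is -1.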
2008-11-21 Lon Hohberger liblogthread: work with stderr etc v*printf can't be used twice between va_start and va_end, so this patch does the following: * call vsnprintf once * fputs() to stderr * send to _logt_print for queueing if needed 2008-11-21 Benjamin Marzinski gfs-kernel: workaround for potential deadlock. Prefault user pages The bug uncovered in 461770 does not seem fixable without a massive change to how gfs works. There is a lock ordering mismatch between the process address space lock and the glocks. The only good way to avoid this in all cases is to not hold the glock for so long, which is what gfs2 does. This is impossible without completely changing how gfs does locking. Fortunately, this is only a problem when you have multiple processes sharing an address space, and are doing IO to a gfs file with a userspace buffer that's part of an mmapped gfs file. In this case, prefaulting the buffer's pages immediately before acquiring the glocks significantly shortens the window for this deadlock. Closing the window any more causes a large performance hit. 2008-11-21 David Teigland groupd/fenced/dlm_controld/gfs_controld: don't retry ccs_connect These daemons had an infinite loop around ccs_connect() in setup_ccs(). This would cause the daemon to hang if corosync/cman crashed between setup_cman() and setup_ccs(). ccs_connect shouldn't need to be retried anyway since cman and ccs are now unified and we do the necessary waiting for cman (and therefore ccs) to be ready in setup_cman(). 2008-11-21 Lars Marowsky-Bree Prepare 2.99.3. Update specfile. 2008-11-21 Fabio M. Di Nitto cman: reenable stderr output in notifyd build: fix fence_node Makefile Add missing include dir and linking dir for liblogthread 2008-11-20 David Teigland fence_node: use logthread instead of logsys, and init the lib (and its thread) after calling into libfence which forks/execs (complicating running threads). 
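Note on the liblogthread entry of 2008-11-21 above: a va_list cannot be consumed by two v*printf calls between one va_start and va_end, so the message is formatted once and the resulting buffer is reused for every sink. A minimal sketch of that pattern follows; log_print, queue_message, last_queued and MSG_MAX are hypothetical names, and the queueing step is a stand-in that only records the last message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MSG_MAX 256

/* Hypothetical stand-in for the library's queueing step: it keeps the
 * last message so the effect of log_print() can be observed. */
static char last_queued[MSG_MAX];

static void queue_message(const char *msg)
{
    strncpy(last_queued, msg, MSG_MAX - 1);
    last_queued[MSG_MAX - 1] = '\0';
}

/* Format once into a buffer, then hand the same buffer to each sink.
 * Returns the value of vsnprintf(), or -1 on a formatting error. */
static int log_print(int to_stderr, const char *fmt, ...)
{
    char buf[MSG_MAX];
    va_list ap;
    int n;

    assert(fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof(buf), fmt, ap);  /* the only use of ap */
    va_end(ap);

    if (n < 0)
        return -1;
    if (to_stderr)
        fputs(buf, stderr);
    queue_message(buf);
    return n;
}
```

If two formatting passes were really needed, va_copy would be the alternative; formatting once is simpler and avoids doing the work twice.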
2008-11-20 Andrew Beekhof dlm_controld: add pacemaker support Adds code for dlm_controld to run under pacemaker instead of cman. Does not include any of the build-related changes. 2008-11-20 David Teigland groupd/fenced/dlm_controld/gfs_controld: log macros Restore the original method of creating the debug string with a time stamp that's used for the memory debug buffer and stderr. The call to logt_print can just use the fmt/args directly; using the temp debug string is pointless. 2008-11-20 Jan Friesse fence: fix IPMI man page Documented timeout (-t) option. 2008-11-20 David Teigland liblogthread: add LOG_MODE_OUTPUT_STDERR When set, logt_print() always prints whatever it gets to stderr. It's really out of place in this library, please feel free to not use it and print to stderr yourself. 2008-11-20 Jan Friesse fence: fix IPMI spawn /bin/bash rather than /bin/sh It's not guaranteed, that /bin/sh will be symlink to shell, which is bash (or ksh) compliant, so it's better to use /bin/bash. fence: fix IPMI typo in help Help contained description of -i option with comment, that this option is deprecated and you should use -i. This is nice recursion, but there should be -a. This fixes BZ #210687 fence: fix IPMI parameters containing special characters IPMI fence agent works by spawn a /bin/sh and ipmitool. If host name/password or any other command line argument included special shell characters (like $, ", ', ...) shell will try to substitute. This is not allowed behaviour and this patch fix it. Should fix BZ #447964 2008-11-20 Fabio M. Di Nitto cman: make init script stop cmannotifyd build: allow system to use zlib in non standard paths build: fix missing ${logtlibdir} for linking cman: port notifyd to new logthread api temporary disable output to stderr. 2008-11-19 David Teigland gfs_controld: new logging stuff Changes per recent discussion on cluster-devel mailing list. dlm_controld: new logging stuff Changes per recent discussion on cluster-devel mailing list. 
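Note on the "fence: fix IPMI parameters containing special characters" entry above: when arguments are passed through a shell, characters such as $, " and ' must be protected from substitution. One common approach, shown here as a sketch and not necessarily what the patch itself does, is to wrap each argument in single quotes and rewrite every embedded single quote as '\''. The function name shell_quote is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Wrap `in` in single quotes for a POSIX shell, writing the result to
 * `out` (capacity `outlen` bytes, including the terminating NUL).
 * Each embedded ' is emitted as the four characters '\'' .
 * Returns 0 on success, -1 if `out` is too small. */
static int shell_quote(const char *in, char *out, size_t outlen)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (outlen < 3)                 /* opening quote, closing quote, NUL */
        return -1;

    out[o++] = '\'';
    for (; *in; in++) {
        if (*in == '\'') {
            /* need 4 bytes now plus closing quote and NUL later */
            if (o + 4 + 2 > outlen)
                return -1;
            memcpy(out + o, "'\\''", 4);
            o += 4;
        } else {
            if (o + 1 + 2 > outlen)
                return -1;
            out[o++] = *in;
        }
    }
    out[o++] = '\'';
    out[o] = '\0';
    return 0;
}
```

Inside single quotes the shell performs no expansion at all, which is why this form is preferred over double quotes for passwords and host names.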
liblogthread: improve thread handling Found cleaner way of starting and stopping thread; just a more correct way of handling things overall it appears. groupd: new logging stuff Changes per recent discussion on cluster-devel mailing list. fenced: new logging stuff Changes per recent discussion on cluster-devel mailing list. liblogthread: time stamp when entry is added instead of when the thread does the write. liblogthread: new options Changes per recent discussion on cluster-devel mailing list. libccs: update ccs_read_logging Changes per recent discussion on cluster-devel mailing list, largely related to debug and logfile_priority. 2008-11-19 Lon Hohberger qdisk: Remove antique #ifdefs for old kernel-mode CMAN qdisk: Update man page. Nuke crc32 code and use zlib. * Show which options may be reconfigured while qdiskd is running (either from a cman config update callback or using kill -HUP). * Shipping our own crc32 algorithm is a waste of code space. qdisk: Make online reconfig actually work rgmanager: make max_restarts work w/o restart_expire_time qdiskd: Process reconfiguration events from CMAN qdiskd: Always use O_NONBLOCK when writing to status_file 2008-11-19 Jan Friesse fence: IPMI over lan timeout adjusted and configurable This patch adjust timeout to default value 10s which should be enough for most today IPMI implementations. It also removes retries, because this job is done by fenced. Because some devices still need longer timeouts, timeout is adjustable by parameter -t (or timeout for stdin and XML configuration). This should fix BZ: 401481, 276541 and 452894 2008-11-19 Christine Caulfield cman: fix signatures of cman_get_privdata & cman_set_privdata They do NOT take a pointer to a handle. 2008-11-19 Fabio M. Di Nitto xmlconfig: major rework cleanup completely our dependency on xpath. use XML internal data structure to get objects and keys associated to objects. speed up the whole config load time a lot. 
remove completely all exception handling that's now done using xml data instead. considerably reduce the amount of memory required to load the config. cleanup other bits (drop unrequired includes and invoke some XML cleanup code). 2008-11-18 Steven Whitehouse mount.gfs2: Remove unused ondisk2.c file Just removing an unused file libgfs2: Remove unused #defines Getting rid of some unused stuff. 2008-11-18 Andrew Beekhof Add files needed by automake Get things building Copy in autogen.sh from pacemaker Add a top-level make file Remove irrelevant portions of configure.in 2008-11-18 convert-repo update tags 2008-11-18 Fabio M. Di Nitto gnbd: remove from cluster project GNBD is now officially a separated project. Code for master branch can be found here: http://git.fedorahosted.org/git/gnbd.git build: change error string 2008-11-17 David Teigland fenced/dlm_controld/gfs_controld: config update reread Reread some daemon config options from ccs when we get a config-update callback. 2008-11-17 Lon Hohberger rgmanager: Avoid status checks during reconfiguration Ignore queued status checks if a configuration update is pending. Basically, a queued status check during reconfiguration could get a status check done before the update is complete, causing an erroneous service restart. 2008-11-17 Dejan Muhamedagic Medium: RA: drbd: return codes are unusable in concert with the local builtin --HG-- extra : convert_revision : 18307a73ac6d29a505214a507efa8fd9fed11228 Low: RA: drbd: RA requires /bin/bash --HG-- extra : convert_revision : fc3af9364c0a5002253b9a8ab53cc747b5ae1f1e 2008-11-17 Fabio M. Di Nitto build: allow libs to have independent sonames by defining SOMAJOR and/or SOMINOR within the library Makefile, we can now override global setting. This can be useful if we need to bump the API/ABI of one single library.
2008-11-14 Lon Hohberger rgmanager: make clulog accept "-" as the first char in messages Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=471431 2008-11-14 Bob Peterson GFS2: gfs2_edit savemeta doesn't work with GFS bz 471239 This patch fixes several problems when saving and restoring GFS and GFS2 metadata when the block size is not the default of 4K. Some of the problems fixed: (1). GFS and GFS2 have different starting offsets and lengths for indirect block pointers. This required a number of code adjustments. (2). The journal index information is kept differently in GFS, so I had to change how the journal index buffers were managed. (3). The GFS dinode structure is slightly different from GFS2 and that was throwing off calculations because the di_height value was in the wrong location. Smaller block sizes made it worse (more blocks and a bigger height required for the same information). (4) GFS1 indirect pointer blocks look like metadata whereas GFS2's do not. As a result, gfs2_edit was improperly truncating those blocks for GFS1 (trying to avoid capturing user data versus metadata). 2008-11-13 David Teigland group_tool: show groupd compat info Show the groupd compatibility mode in output of group_tool ls. Suggest 'group_tool ls' if compat mode is 1 and daemons were queried (with -a) instead of groupd. Suggest 'group_tool ls -a' if compat mode is 0 and groupd was queried (without -a) instead of the daemons. fence_tool: refuse to leave if dlm lockspaces exist We already checked for mounted gfs/gfs2 filesystems and refused to leave if any were found. We should also refuse to leave if any dlm lockspaces exist. dlm_controld: clear plock syncing flags When a bunch of nodes all join a new lockspace together, no one will have any plock state, and no one will do a checkpoint to sync plock state to others. We just need to recognize when that happens and clear the need_plock and save_plock flags; the routine that completes the syncing process won't ever run to do it. 
2008-11-13 Lon Hohberger rgmanager: Put init_resource_groups prototype in one place rgmanager: Remove polling code; misc cleanups Fix some uninitialized variables & function prototype mismatches. 2008-11-12 Marek 'marx' Grac [fence] Extension to fence agent for LPAR/HMC with 'list'/'monitor' operation [fence] Extension to fence agent for BladeCenter with 'list'/'monitor' operation [FENCE] Support for long options (eg. --ssh, --help) This patch adds a support for long options (using standard getopt) to all fence agents. Names of long options are not stable, so it is possible that they will change before merging to other branches. 2008-11-11 Dejan Muhamedagic Low (LF 1977): build: typos affecting the configure phase --HG-- extra : convert_revision : f38526cb44ba16c1fe4925767ec78aa8bad9cffd Low: ocf ra environment: protect variable expansion with quotes --HG-- extra : convert_revision : 4baf77a6f006b1bae4a492e46432f3beb4c2ba3d 2008-11-11 Fabio M. Di Nitto config: fix loading of multiple objects with no subojects libxml returns a different value for depending if foo has child entries or not. Use a bit of string magic to use the return value always in the same way. libfence: use ccs_connect instead of force_connect. force_connect without arguments is the same as ccs_connect. ccs_connect in 3.0 does not depend on quorum anylonger like it used to and it will allow connection as soon as cman/corosync is up and running. 2008-11-10 David Teigland libfence: no logging Libraries shouldn't do logging, and the two calls to syslog here are not essential. libfence: remove ccs reconnect code ccs no longer has the problem of connections timing out. 2008-11-10 Lon Hohberger rgmanager: Handle notifications from cman for config updates Without this patch, rgmanager spins because it doesn't know how to handle CMAN_REASON_CONFIG_UPDATE. This fixes it. rgmanager: Fix debug build error 2008-11-10 Fabio M. 
Di Nitto build: respect build: respect EXTRA_CFLAGS in cobj.mk 2008-11-10 Lon Hohberger rgmanager: Use CCS again instead of building everything NO_CCS rgmanager: Enable stderr logging when run in foreground 2008-11-10 Fabio M. Di Nitto libccs: cleanup fix return code init var fix comments rgmanger: fix build system Make rgmanger build again. Install clurgmgrd compat symlink. 2008-11-10 Lon Hohberger rgmanager: Rename clurgmgrd -> rgmanager End the madness. rgmanager: Nuke clurmtabd since it's not used/needed 2008-11-10 Christine Caulfield cman: replace high_nodeid with votes in transition message This is for forward-compatibility with future quorum service. high_nodeid was never used so I've use that slot for votes and incremented the minor number so we know what's going on. This patch is already in STABLE2 2008-11-08 Lon Hohberger liblogthread: Fix sefault if fopen() fails for any reason 2008-11-07 Lon Hohberger fence: Fix bug in make_args() A bug in make_args() caused it to always return an error code, even if we formulated a perfectly good set of arguments. Basically, eventually ccs_get_list returns -1 when all items are exhausted, so the check at the bottom of the function would free any arguments we had (correctly) set up and return an error code, thereby avoiding actually calling the agent. 2008-11-07 Ryan McCabe libfence: whitespace cleanup 2008-11-07 Lon Hohberger qdisk: fix block size check When using device="" instead of label="", this check was causing qdiskd to incorrectly exit. Resolves: #470533 [fence] Make fence_xvmd "reboot" work with newer versions of libvirt 2008-11-07 Jan Friesse fence: New fence agent for VMware using vmrun command. This agent is based on other idea, than previous one and works on VMware ESXi (previous one didn't). Main idea of previous agent was connect to ESX via ssh and there run vmware-cmd command. This command looks deprecated and maybe will not available in next version of ESX. 
ESXi have bigger problem, because ESXi doesn't have ssh administration console. This agent will directly connect to VMware via native API and doesn't need ssh allowed on ESX host. Problem is, that you must install vmrun command by hand (available from VMware web pages as installer) on every node you want to do fencing of VMware guest machines. 2008-11-07 Fabio M. Di Nitto build: fix kernel module install dir to respect DESTDIR Either this, or another invokation to KBUILD. This is faster :) init scripts: major rework to make them distro agnostic - adapt build system to generate init scripts on the fly to respect some installation paths - create top level headers to set vars and defaults for/from each distribution. - remove absolute path calls to standard distribution tools (pidof, kill, etc). - replace few functions from /etc/init.d/functions with local ones when those ones are not available. - rework the output redirection across board to fix several race conditions during normal operations. - fix some statements that are now invalid in bash3. - fix cman status to skip ccsd if it's not being selected specifically. - standardize LOCK_FILE all over. - fix killing process for qdisk and rgmanager to use standard distro tools. - substitute "action" with normal rtrn parsing in gfs and gfs2 init scripts. - fix clulog usage in rgmanager init script. - generally avoid to fork external process to gather return statuses. All of the above tested on Fedora 10, Debian and Ubuntu. 2008-11-06 Abhijith Das gfs-kernel: Bug 466645 - reproduceable gfs (dlm) hanger with simple stresstest GFS used to attempt to prefetch inode/iopen locks in readdir in the anticipation that stats will be called on the dirents returned. Running a simple 'find' without stat, resulted in wasteful prefetching and poor performance to the point that it seemed like find was hanging the system. This patch performs prefetch on a directory's inodes based on a stat-rate. 
i.e the rate at which stats are performed on the dirents returned by readdir. If there are a significant number of stats being performed, we enable prefetching. Otherwise, readdir is performed without prefetches. 2008-11-06 Ryan O'Hara BZ 453429: Fix conditional check of $OCF_RESKEY_migration_mapping to be double quoted. 2008-11-05 David Teigland dlm_controld: join should return error without fence domain If there's no fence domain, and a process tries to join a lockspace, we should return an error from join_lockspace. dlm_controld: fix the recent realloc fix in deadlock code It was just dropping the newly allocated memory on the floor. Now just use malloc to make it more obvious what's happening. dlm_controld: enable calls into deadlock code for handling a confchg during a deadlock cycle. Also remove some old compat code. 2008-11-05 Fabio M. Di Nitto build: prefer init scripts generated in the objdir rather than source This avoids a simple problem of building arch foo in objdir and arch bar in srcdir. Preferring objdir will make sure always to install the correct ones. 2008-11-04 David Teigland gfs_controld: simplify misc device handling and fix plock dump These are two things that went into dlm_controld but not into the duplicated code in gfs_controld. . depend on udev to create misc devices, so we remove a ton of complicated code that manages device nodes . fix plock dump, the dump size was never being set so the dump was always zero length 2008-11-04 Fabio M. Di Nitto ccs: simplify libccs reload code simple refactoring to make it simpler to plug into other functions. remove unrequired special casing. ccs: libccs implement reload operations 2008-11-04 Jan Friesse fence: Fix operation 'list' and 'monitor' for LDOM and ePowerSwitch This two fence agents had same method for getting power status and getting outlets. 
That method simply checked which operation should be processed, but the main fencing library doesn't set this option to 'status' when powering an outlet on or off. This led to bad behaviour for the on, off and reboot operations. The status, list and monitor operations weren't affected. 2008-11-04 Christine Caulfield cman: Tidy some English phrases and typos 2008-11-04 Fabio M. Di Nitto cman notifyd: export quorum information on statechange 2008-11-04 Christine Caulfield cman: Some edits to the cmannotifyd man page 2008-11-04 Fabio M. Di Nitto cman notify: wait for forked process to terminate.. .. and leave zombies around the system.. cman notify: fix a few bits in the shell area Fix cman_notify NOTIFYDIR typo. Fix cman_notify log redirection. Fix cman_notify_template.sh log redirection. cman notify: add logging to cman_notify save information on what we are doing when executing scripts. cman notify: add script template to doc/ This is a very simple template that can be generically useful to write notification snippets. cman notify: add note to man page cman notifyd: add man page 2008-11-03 Fabio M. Di Nitto fence scsi: plugin reload notification script into cmannotifyd cman notify: update init script start/stop/start for cmannotifyd cman notification: add shell and build infrastructure install cman_notify script in sbindir (this will be invoked by cmannotifyd). cman_notify.in: similar to Debian run-parts and to Fedora run-parts, executes scripts within a directory (CONFDIR/cman-notify.d). update build snippets to install and uninstall from cman-notify.d cman notify: add call back to external script fix also a bug in byebye_cman and add some debugging output. cman: add new daemon for notification to custom bits this is just the usual cluster daemon template. logthread: fix usage of syslog(3) and also fix a build warning.
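The run-parts style behaviour of cman_notify described in the 2008-11-03 entries above (execute every script found in a directory such as CONFDIR/cman-notify.d) could look roughly like the following. This is a hedged Python sketch, not the actual cman_notify shell script; the function name `run_parts` and the way extra environment variables are passed are assumptions made for illustration.

```python
import os
import subprocess

def run_parts(directory, extra_env=None):
    """Run each executable regular file in `directory`, in sorted name order.

    Returns a list of (name, exit status, captured stdout) tuples.
    """
    # Start from the caller's environment and add any notification variables.
    child_env = dict(os.environ)
    child_env.update(extra_env or {})
    results = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip directories and files without an execute bit.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        # subprocess.run() waits for the child, so no zombie is left behind.
        proc = subprocess.run([path], env=child_env,
                              capture_output=True, text=True)
        results.append((name, proc.returncode, proc.stdout))
    return results
```

Collecting the exit status and output per script makes it straightforward to log what was executed, which is what the "add logging to cman_notify" entry asks for.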
2008-11-03 Christine Caulfield cman: Add some more comments about shutdown 2008-10-31 Abhijith Das gfs-kernel: bz466677 - fault in posix_lock_file() - "gfs_controld" responds to orphaned "plock_xop" request - suspected cause is patch for Bug 196318 gfs_lock() doesn't handle F_CANCELLK correctly (command not implemented). This patch implements the F_CANCELLK command in gfs_lock(). 2008-10-31 Fabio M. Di Nitto rgmanager: Fix smb.sh shell scripting 2008-10-31 Takenaka Kazuhiro High: RA: Squid: new OCF RA --HG-- extra : convert_revision : 2ba71621b4983137be77ef362c088971d4105059 2008-10-31 Christine Caulfield cman: Some fixes for configless running Create /cluster if it doesn't exist (and don't bother trying to copy any subkeys into CONFIG_PARENT_OBJECT). Also don't reload keys if we are running configless. 2008-10-31 Fabio M. Di Nitto groupd: handle cman config update notifications allow at least logging to be reconfigured runtime gfs_controld: handle cman config update notifications allow at least logging to be reconfigured runtime dlm_controld: handle cman config update notifications allow at least logging to be reconfigured runtime cman: update man page for reload operations cman: implement and simplify configuration reload operations cman: implement reload operations for cman-preconfig add first cut for reload operations. cman: make cluster_parent_handle function specific this avoids different problems at reload time, like holding a reference to an object that does not exist anymore. 2008-10-30 Fabio M. Di Nitto gfs2: randomize creation of temporary directories for metafs mount more a07d8d56e945a265f3da2857ad1316f49c4ae157 didn't add enough security to the whole random mount point. This change: - introduces a better randomness in mount_gfs2_meta by using mkdtemp (this is a required change for security reason). - the use of mkdtemp makes the whole dir_exists code unrequired (hence removed by the patch). - we force each tool to create its own meta mount. 
This fixes any possible race conditions between tools (thanks to the use of mkdtemp, which guarantees the creation of a unique mount point). It also makes the find_gfs2_meta function unnecessary (hence removed by the patch). - clean up struct gfs2_sbd of unnecessary fields. - clean up the cleanup_metafs code path. - clean up exit path from mount_gfs2_meta. - simplify code around different tools by using mount_gfs2_meta only. - turn lock_for_admin static. rgmanager: randomize svclib_nfslock temp dir by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Randomize temp dir via mktemp. rgmanager: randomize smb.sh temp file by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Randomize temp files via mktemp. misc: fix mktemp usage rgmanager: randomize oracledb.sh temp file by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Randomize temp files via mktemp. rgmanager: randomize SAPDatabase temp file even more ad4d9bd9216d6fa259118017bac8b5c032b7b3b9 didn't introduce enough security. Switch to mktemp(1). rgmanager: randomize ASEHAagent temp files even more f8943d17d4c108d44af7a42ad0fd646d4e75990e didn't introduce enough security. Switch to mktemp(1). rgmanager: move state dump file where it belongs commit bde3b975fc4caffaa2aabd82d7d3cda8f4432f6a tried to mitigate a possible DoS in the wrong way. Move the file where it belongs instead. rgmanager is a "long time" running daemon and the pid is known upfront, making the old fix pointless. gfs2: randomize file for savemeta operations even more 15f9eb851b8924271095b159b2d077a9a0595cd5 didn't introduce enough security. Switch to mkstemp(3) and clean up unnecessary code as a consequence. gfs2: randomize debugfs mount point even more 18b24ae55c3e4abdc256a3b6c4f15ae0116a0f14 didn't introduce enough security. Switch to mkdtemp(3) and clean up unnecessary code as a consequence.
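The temp-file entries above move from predictable /tmp names (static, then pid-suffixed) to mktemp(1), mkstemp(3) and mkdtemp(3). The difference can be sketched as follows. This is an illustration in Python rather than the C and shell used in the tree, and the prefixes and function names are made up for the example.

```python
import os
import tempfile

def predictable_path(prefix="/tmp/savemeta"):
    # Pid-suffixed name: still guessable, so a local user can pre-create
    # the path (or a symlink) before the tool runs.
    return f"{prefix}.{os.getpid()}"

def secure_dir(prefix="metafs-"):
    # Python counterpart of mkdtemp(3): the directory gets a random name
    # and is created by this call, so nobody else can own it first.
    return tempfile.mkdtemp(prefix=prefix)

def secure_file(prefix="savemeta-"):
    # Python counterpart of mkstemp(3): returns (open fd, path); the file
    # is created exclusively rather than opened if it already exists.
    return tempfile.mkstemp(prefix=prefix)
```

The caller owns the cleanup in both secure variants: close the descriptor, then remove the file or directory once the operation has finished.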
ccs: cleanup ccs_read_logging simplify the code a lot based on Dave implementation. use a buffer for file instead of passing pointers around. Makes life easier. move all common stuff (yes-no / on-off / get string) into their own functions. 2008-10-30 Marek 'marx' Grac [fence] WTI should not power on/off plug if it is unable to get status Fix #468904 - I have to remember difference between and/or [fence] WTI should not power on/off plug if it is unable to get status Fix #468904. On some WTI devices plugs are numbered as 1,2,3 and on the others as A1,A2,...B1,.... Both types accept numbers (A1 = 1, B1 = [number of last A] + 1). Power on/off works with numbers but if we want to parse status of plug then we have a problem. This patch is a general solution (fencing library) because it tests value of get_status() which have to be on/off otherwise we exit with new error code. 2008-10-29 David Teigland fenced/dlm_controld/gfs_controld: query thread mutex should be held around some places it wasn't and can be released in some other places it wasn't. 2008-10-29 Fabio M. Di Nitto xmlconfig: fix buffer overflow when reading huge config files it was possible to overflow a buffer when adding more than 52 entries within the same xml block: fix the overflow by turning the whole allocation dynamic rather than static. This will allow only limits imposed by the system memory. ccsais: fix buffer overflow when reading huge config files it was possible to overflow a buffer when adding more than 52 entries within the same xml block: fix the overflow by increasing the limit to 1024 and fail to start if we hit the limit. ccs: add ccs_read_logging this function is a simple wrapper to gather all logging configuration from one central place since this code is replicated already N times across the tree. the function should always be invoked with the selected default values and it will apply overrides from the config (if available and if possible). 
you are responsible for freeing char **file when changed via cluster.conf. build: use standard syslog priority name rather than corosync this allow us to use external logging entities and retain compatibility 2008-10-28 David Teigland gfs_controld: move log_error message Limit the error message to the case that shouldn't happen. It was being printed in cases that can happen during normal operation. 2008-10-28 Fabio M. Di Nitto common: plug liblogthread in the system add toplevel common/ infrastructure for bits that needs to be built before anything else and that they are "common" to several daemons. code that lands in here should not depend on anything else in the tree. plug liblogthread into common/ and the build system. ccs: libccs major rework pass 5 lindent the code.... ccs: remove duplicate entry in internal header file ccs: libccs major rework pass 4 move all helper functions at the top of libccs, and the public API at the bottom. make _ccs_get static ccs: libccs major rework pass 3 split fullxpath code into fullxpath.c ccs: libccs major rework pass 2 split xpathlite code into xpathlite.c add ccs_internal.h to define internal API. ccs: libccs split ccs_lookup_nodename into extras.c extras.c will collect wrappers for libccs that do not provide libccs core functionalities. ccs: libccs major rework pass 1 return real ccs_handles instead of fake ones. This will allow one application to have more than one concurrent connection to ccs. reinstante tracking as it was provided in the past by ccsd. move almost all global variables into function specific ones. make ready for splitting xpathlite and fullxpath into their own files. make ready to split xpathlite parsing into separate functions. make ready to split helper functions to access confdb into its own file. make it easier to debug by storing info into the objdb. cman: add /libccs/@next_handle support next_handle will be atomically incremented by corosync to return ccs_handle down the pipe. 
We create it in cman-preconfig to avoid an "init" race in libccs. 2008-10-27 David Teigland groupd/fenced/dlm_controld/gfs_controld: init logging after fork Initializing logging creates threads which is complicated by forking. Move logging init after fork, since we don't use it until then anyway. 2008-10-27 Jan Friesse fence: Operation 'list' and 'monitor' for Alom, LDOM, VMware and ePowerSwitch None of this devices returns alias with list operation, because it doesn't make any sense (LDOM, VMware) or it isn't possible to get that information (ePowerSwitch). fence: Fix -C switch description in Python library Separator between -C and description was tab key. It looked very bad in console, because -C description was not aligned with other descriptions. 2008-10-27 Lars Marowsky-Bree IPaddr2: support IPoIB gratuitous arps This patch allows the configuration of a heartbeat based NFS export of CXFS clients on a cluster using NFS/IPoIB. It sends gratuitous arps usin IPoIB by using ipoibarping instead of send_arp, which relies on libnet and doesn't know IB. --HG-- extra : convert_revision : 6ec2b3fe81431eba46c776e46c2ba642515ff0a6 2008-10-27 Fabio M. Di Nitto ccs: implement config reload in legacy ccs add reload operation to legacy ccsais config plugin. xmlconfig: remove debugging fprintf 2008-10-24 David Teigland dlm_controld: fix plock dump The size of a plock dump was always zero so nothing would be sent. 2008-10-24 Lars Marowsky-Bree Remove references to removed man pages from configure.in. --HG-- extra : convert_revision : 364683b923326b0abc527750ecc647fb67713af1 2008-10-23 Marek 'marx' Grac [fence] Operation 'list' and 'monitor' for WTI IPS 800-CE This should work also on other WTI devices but currently I don't have any other. [fence] Operation 'list' and 'monitor' for iLO, DRAC5 and APC Operation for listing available outlets works also for single-unit fencing devices and returns N/A (can be changed to anything else). 
Option 'monitor' can be used for monitoring the health of the fencing device. It does not output anything and will perform operation 'list' (on multi-port devices; e.g. APC) or operation 'status' (on single-port devices; e.g. iLO). 2008-10-23 Jan Friesse fence: Fixed case sensitivity in action parameter. Some agents use a case-sensitive action parameter. This patch makes those agents case-insensitive and fixes their man pages to reflect this change. 2008-10-23 Lars Marowsky-Bree Cleanup metadata and remove reference to status and promotion. --HG-- extra : convert_revision : e7713141712bf9ed7447c76ecdc74358e7fa6977 scsi2reservation: New RA from Xinwei Hu. --HG-- extra : convert_revision : a71ddcd5a984e832e882bf2e645ae8cdb34a4a98 Add nfsserver RA to makefile. --HG-- extra : convert_revision : a8cc3de3d43689d84d8b95c725a065c2ec6334bb nfsserver: New RA from Xinwei Hu. --HG-- extra : convert_revision : 90af633d8164f9365bf94505f1fb89ae4c79fbf1 sfex: Manage exclusive access to disks. Improved by Xinwei Hu, originally posted by OKADA Satoshi. --HG-- extra : convert_revision : 9f801f86150c8b6fc920fc31fcc14265dee6e158 2008-10-22 Jonathan Brassow rgmanager (HALVM): Stop dumping debug output to /tmp Remove some left-over debugging 2008-10-22 Jan Friesse fence: Added support for no_password in fence agents library and fence_eps. Some fence devices don't need a login name and password for fencing (this is generally a bad idea, but ePowerSwitch 8M+ is a good example). 2008-10-22 Christine Caulfield cman: fix two_node startup if -e is specified two_node startup could fail because the vote_sum variable isn't calculated if expected_votes is specified on the command-line. Now we always iterate the nodes list to get the vote_sum. 2008-10-22 Fabio M. Di Nitto libgfs2: randomize creation of temporary directories for metafs mount by using a static path to /tmp, the operation can fail in different ways. randomize the path a bit by using the invoking pid.
This will also allow multiple simultaneous invocations of mount_gfs2_meta on different mountpoints. Similar to 18b24ae55c3e4abdc256a3b6c4f15ae0116a0f14 there is a small race condition in this implementation. Implementation: - Fix return info from find_gfs2_meta to set TRUE when we find a mountpoint in gfs2_sbd struct. - Add metafs_created_mount to gfs2_sbd struct to propagate info if we did create the mount point or not. - Randomize sdp->metafs_path with pid info. - Remove the metafs_path mount point if we did create it. rgmanager: randomize SAPDatabase temp file by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Mitigate the issue by randomizing the temp files with pid. build: reinstate targets in rgmanager metadata check rgmanager: move oracledb.sh log files where they belong by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Move the log files to LOGDIR where they are expected to be. LOGDIR is generally owned by root and doesn't allow normal users to play with it. rgmanager: move nfsclient.sh cache files where they belong by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Move the cache files to /var/cache/cluster that's owned by root and doesn't allow normal users to play with it. rgmanager: move fs.sh log files where they belong by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Move the log files to LOGDIR where they are expected to be. LOGDIR is generally owned by root and doesn't allow normal users to play with it. rgmanager: randomize ASEHAagent temp files by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Mitigate the issue by randomizing the temp files with pid. 2008-10-22 Jim Meyering handle some malloc failures "Fabio M. Di Nitto" wrote: > Merged into master branch.
I had some issues to get them in the tree > so the SHA1 are different from your original posting but it's all > there. Thanks, Fabio. Here's a tiny adjustment: >From 01c30afac0bf97d84df11cc32773304a99613a08 Mon Sep 17 00:00:00 2001 From: Jim Meyering Date: Mon, 13 Oct 2008 15:22:55 +0200 Subject: [PATCH] adjust a diagnostic * group/dlm_controld/deadlock.c (add_waitfor): Tweak diagnostic. 2008-10-22 Fabio M. Di Nitto rgmanager: randomize file for automatic data dump by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Mitigate the issue by randomizing the output file with pid. gfs2: remove unused define gfs2: randomize file for savemeta operations by using a static path to /tmp, the operation can be used to trigger a local DoS by a normal user. Mitigate the issue by randomizing the output file when none is specified. gfs2: randomize debugfs mount point by using a static path to /tmp, the operation can fail in different ways. randomize the path a bit by using the invoking pid. This will also allow multiple simultaneous invokation of lockdump operations. There is still a small window for a race where one gfs2_tool invokation has created the mount point, the second one finds the mount point, the first one succeed in an umount and the second one will not be able to access the data. This is a very unlikely condition to happen and it's not mission critical so we can live with it. fence: update man page for fence_apc we don't save logs there anymore 2008-10-21 Jan Friesse [fence] Fixed man pages makefile, so fence_eps.8 is now installed. [fence] Fence agent for ePowerSwitch 8M+ (fence_eps) bz 467112 Fence agent for ePowerSwitch 8M+ works only on 8M+ device, because this is only one with hidden page (this feature has to be allowed, otherwise agent will not work) feature support. 2008-10-21 Fabio M. 
Di Nitto misc: fix gfs2_edit build broken by wrong copyright update commit 2008-10-21 Jim Meyering add comments marking unchecked strdup calls * config/tools/ccs_tool/editconf.c (add_fence_args): Remove unused local variable, buf, * gfs2/edit/savemeta.c (save_inode_data): ...along with malloc and free. add comments marking unchecked malloc calls * gfs/gfs_fsck/super.c: * gfs/libgfs/fs_dir.c: * gfs/libgfs/inode.c: * gfs/libgfs/super.c (ji_update): * gfs2/edit/hexedit.c (display_indirect): * gfs2/edit/savemeta.c (gfs1_rindex_read): * gfs2/fsck/initialize.c (init_system_inodes): * gfs2/libgfs2/super.c (rindex_read): * group/daemon/app.c: * group/daemon/cpg.c (deliver_cb): * group/daemon/joinleave.c (new_node): * group/daemon/main.c (do_get_groups, do_send): * group/dlm_controld/plock.c (unpack_section_buf): * group/gfs_controld/plock.c (unpack_section_buf): * src/daemons/clurmtabd_lib.c (rmtab_insert): * src/daemons/dtest.c (dtest_shell, main): remove dead code (useless test of memset return value) * gfs/gfs_fsck/block_list.c (block_list_create): memset can't fail. There are many more like this. * gfs/gfs_fsck/inode.c (check_inode): handle failed malloc * fence/agents/xvm/ip_lookup.c (add_ip): Handle malloc failure. don't dereference NULL upon failed realloc * gfs/tests/filecon2/filecon2_server.c (main): Fix typo (s/sock/ssin/) that would make a failed realloc cause a NULL dereference. * gnbd/tools/gnbd_export/gnbd_export.c (execute_uid_program): Diagnose a failed realloc. * group/dlm_controld/deadlock.c (add_waitfor): Handle failed realloc. 2008-10-21 Fabio M. Di Nitto misc: cleanup copyright.... again 2008-10-20 Simone Gotti [rgmanager] Fix fuser parsing on later versions of psmisc Description of problem: fuser from rhel5 has different output respect the one from rhel4 so fs.sh needs a little change. 
Looks like the 2 differences are: *) mountpoint has an ending colon *) Everything except the pid file is printed to stderr instead of stdout Bugzilla #467686 2008-10-20 Satomi TANIGUCHI High: new stonith plugin: external/kdumpcheck --HG-- extra : convert_revision : a29f1b78dfe5f98e3aef15358da920db4ab0f8c4 2008-10-20 Fabio M. Di Nitto rgmanager: fix build after port to logsys 2008-10-20 Abhijith Das gfs-kernel: bz 458765 - In linux-2.6.26 / 2.03.06, GFS1 can't create more than 4kb file Temporary workaround fix to make gfs1 work properly in upstream kernels until we do it the right way by using the new aops write_begin/write_end instead of the prepare_write/commit_write interface that we currently use. 2008-10-17 Marek 'marx' Grac [fence] Operation 'list' for APC fence agent Operation 'list' returns a list of available outlets (with aliases) or names of virtual machines. This patch includes changes to the fencing library and to the APC fence agent. There is a third argument for fence_action() that is None by default, so we don't have to change other fence agents that do not support this operation yet. In some cases we can overload get_power_status(), as we just need to cache the result; in other cases it will be a completely separate function (e.g. 'xm list'). @note: The current implementation for APC lists only the outlets on one device. If we have more than one switch (MasterSwitch), we will see outlets on this device only. 2008-10-17 Fabio M. Di Nitto fence egenera: fix logging file Move log file where it belongs with all the others. Addresses: CVE-2008-4192 2008-10-17 Simon Horman Add AM_PROG_CC_C_O to configure.am As advised by the following warning: automake (GNU automake) 1.10.1 Copyright (C) 2008 Free Software Foundation, Inc. License GPLv2+: GNU GPL version 2 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Written by Tom Tromey and Alexandre Duret-Lutz .
heartbeat/Makefile.am:57: compiling `heartbeat.c' with per-target flags requires `AM_PROG_CC_C_O' in `configure.in' --HG-- extra : convert_revision : a4431b4f174920a198b60c1e11ed2da2fb531f5d 2008-10-15 Lon Hohberger [fence] Fix fence_xvmd trying to read wrong args from ccs 2008-10-14 David Teigland fenced/dlm_controld/gfs_controld: modify a debug message show the cg seq number in the wait_messages debug line to make it clear which cg it's counting messages for gfs_control: improve ls output copy what was done with fence_tool and dlm_tool. 2008-10-14 Christine Caulfield cman: fix a couple of unhandled malloc failures Thanks to Jim Meyering for the patches. 2008-10-14 Abhijith Das gfs-kernel and mount.gfs2: GFS ignore the noatime and nodiratime mount options Since the vfs moved the MS_NOATIME, MS_NODIRATIME flags from the superblock to the vfsmount structure (MNT_NOATIME, MNT_NODIRATIME, which are not accessible to gfs), gfs no longer knows when to enable/disable atime updates and the atime_quantum stuff is broken in the sense that it doesn't respect the noatime and nodiratime flags. This patch attempts to fix this by creating a gfs-specific mount option gfs_noatime and having the mount.gfs helper pass it along when the user specifies noatime or nodiratime in the command line. It's not the ideal way to fix it, and it is a bit ugly. Revert "gfs-kernel: bz457473 - GFS ignore the noatime and nodiratime mount options" This reverts commit 324a4ffc12821ddde1f583a214e98ba9c8c2540c. 2008-10-10 David Teigland dlm/fence: daemon fixes and tool improvements fence_tool/dlm_tool: improve info in ls output fenced/dlm_controld: fix confchg/message processing, must be done after each individual confchg/message dlm_controld: fix fencing checks which weren't happening dlm_controld: improvements to recovery debug messages 2008-10-10 Steven Whitehouse libgfs2: Add support for UUID generation to gfs2_mkfs Uses /dev/urandom to create 16 byte UUIDs for GFS2 filesystems at mkfs time. 
Backwards and forwards compatible with all GFS2 filesystems. You'll need a set of kernel headers with the new field defined in order for this feature to be enabled. Bugzilla #242690 2008-10-09 Marek 'marx' Grac [RGMANAGER] - Fix #462910 postgres-8.sh and metadata fixes 2008-10-08 David Teigland daemons/tools: misc minor cleanups and improvements fenced/fence_tool: fix and improve output of ls daemons: don't attempt cpg exit cleanup after cluster goes down daemons: fix lazy memset size args to avoid mistakes dlm_controld: clean up daemon cpg on exit fence_tool/dlm_tool/gfs_control: remove error message for the 'ls' command when the daemon isn't running. 2008-10-07 Satomi TANIGUCHI Low: build: add drac5 to configure.in --HG-- extra : convert_revision : 1eae6aaf1af8232a0108dc932e156ab5e8b69abb 2008-10-06 Ryan McCabe cman: Fix typo that caused start-up to fail "/usr/sbin/cman_tool" should be "/sbin/cman_tool" 2008-10-06 Abhijith Das gfs-kernel: bz457473 - GFS ignore the noatime and nodiratime mount options This patch corrects noatime support in GFS. It works by ditching the special casing which we had previously and using the support already in the VFS layer. The problem which this solves was introduced a while back when noatime became a per-vfsmnt flag rather than a per-sb flag. GFS was still assuming that this flag would be passed to the fs, whereas in reality it was being horded by the vfsmnt code and not being passed to the fs. As a result of this patch, GFS will not only obey the noatime flag correctly, but also relatime and these will also be supported per-vfsmnt rather than on a per-sb basis as before. This fixes bz #457473 2008-10-04 Lars Marowsky-Bree RA: Filesystem: ocfs2: detect openAIS versus heartbeat stacks correctly. --HG-- extra : convert_revision : 83a87f2b6554c4b9e86e37908d93f47d0514bed8 2008-10-02 Lars Marowsky-Bree RA: o2cb: Remove, as it was just proof-of-concept code which never worked, but greatly confused users. 
--HG-- extra : convert_revision : 7ef4506f714e550b098d52d6b951767b6e9c9369 Update version to 2.99.2. --HG-- extra : convert_revision : e7b90a3632163071f4162c13f835f98cf4a28a6d RA: Filesystem: Tolerate "notify" being enabled for non-OCFS2/SLES10 environments. They are a waste, but harmless, so no need to break existing setups. --HG-- extra : convert_revision : 7b88dc7fe98601613a87a0200592fc39625bf893 RA: Filesystem: No longer advertise the OCFS2/SLES10 specific parameters. If they haven't needed them until now, they will not need them going forward ;-) --HG-- extra : convert_revision : 4c939638c5965b8a4cb127c71d4c158982dc5b0c RA: Filesystem: Make OCFS2 user-space extensions dependent on the SLES10 compatibility mode. This allows users of SLES10 to also use the newer heartbeat releases, while also working with the newer openAIS stack. --HG-- extra : convert_revision : 0ef1c8b13586477fd0bb573920d312c8f04555f8 2008-10-02 Abhijith Das gfs-kernel: GFS: madvise system call causes assertion Since the madvise system call was enabled by the patch to bug 429343, it's possible for a inode glock holder to never get dequeued through gfs_readpage. This causes an assertion (bug 464837) GFS: fsid=cl102a:gfs1.1: warning: assertion "(gh->gh_flags & LM_FLAG_ANY) || (tmp_gh->gh_flags & LM_FLAG_ANY)" failed GFS: fsid=cl102a:gfs1.1: function = add_to_queue GFS: fsid=cl102a:gfs1.1: file = /builddir/build/BUILD/gfs-kmod-0.1.23/_kmod_build_/src/gfs/glock.c, line = 1418 GFS: fsid=cl102a:gfs1.1: time = 1222739610 This patch reverts the patch to bz 429343. Don't log any warnings/errors and simply return ENOSYS when you arrive at gfs_readpage without the inode glock held. (madvise syscall case) 2008-10-02 Bob Peterson gfs-kmod: GFS corruption after forced withdraw bz 452274 GFS file systems were being corrupted because some of the functions in log.c were writing to the journal after the file system had been withdrawn. 
2008-10-02 Lars Marowsky-Bree RA: Filesystem: Correct exit code used when trying to cluster-mount a non-clustered filesystem. --HG-- extra : convert_revision : 6aaa222ec1a0b27901198701287b3476f4d28dca Back-out merge. --HG-- extra : convert_revision : bc9e6aea00009368382162201d540aebfc203539 Re-instating OCFS2 special hooks. --HG-- extra : convert_revision : be3035796387a2757ab245687ebbbf322c08bf43 Back-out merge. --HG-- extra : convert_revision : ac86e1791e8ac9bd0f5bdd3e1a2db6c2b6ab8cd4 Backed out changeset 0f3bb32fd483 --HG-- extra : convert_revision : fb60def123f41133e629af51656538c0f0982e69 2008-09-29 Ryan O'Hara fence_scsi: correctly declare key_list The key_list hash was not being declared in the get_key_list subroutine, which was causing a problem with the scope of the variable. In short, Perl assumes that the scope of an undeclared variable is global. This caused the contents of the hash to remain unchanged. (BZ 462628) fence_scsi: improve logging for debugging Using the -v (verbose) option will print more information than it previously did. Output will also be easier to understand, which should help track down any problems that might occur. 2008-09-29 rohara scsi_reserve: add restart option Added restart option to scsi_reserve init script. Using this option will result in re-registering with all devices. It will not remove any existing registrations, since doing so would be dangerous. In short, it is nearly identical to calling the script with the "start" option. (BZ #455330) fence_scsi.pl: check if nodeid is zero If the nodeid we get from the XML query of cluster.conf is zero, then either the node does not exist in the cluster or the nodeid is not set. Each case is invalid, so report an error and exit. 2008-09-29 Ryan O'Hara cman: allow custom xen network bridge scripts This patch allows users to define custom scripts for Xen network bridging. Previously, the name of the Xen network bridge was hard-coded in the cman init script.
Users that wish to use custom Xen network bridge script should define NETWORK_BRIDGE_SCRIPT in /etc/sysconfig/cman. This script must exist in the /etc/xen/scripts directory. Users must also update the /etc/xen/xend-config.sxp file accordingly. 2008-09-26 Lon Hohberger rgmanager: don't change the build target just yet rgmanager: make clulog build even though it's incomplete group: Allow group_tool ls to be scriptable * Returns 1 if the group is not found or found and not joined * Returns 0 if the group is found and joined This is needed to solve rhbz 459754. rgmanager: First pass at port to logsys logging.c is from group/*_controld with a few mods clulog (command line utility) is not done yet. 2008-09-26 Dejan Muhamedagic Low: build: disable inlines in netsnmp if gcc version is 4.3.0 or higher --HG-- extra : convert_revision : 3127c6aa43a22db99d6ad2da9667c303613f55d9 Low: build: fix extracting gcc version --HG-- extra : convert_revision : 4c368b8ac53fa10d0d95f93f6dbaf1c4e882dbd2 2008-09-26 Marek 'marx' Grac [FENCE] Fix #290231 - "Switch (optional)" param does not default to "1" and program fails Bug itself was almost corrected in new fence agent but unfortunately '1' was entered as number not as string. Python can not do int + string and agent fails. 2008-09-25 David Teigland fenced/fence_tool: improve list info Add wait condition to the output, and make the current victim visible for the full duration of fencing. 2008-09-25 Bob Peterson GFS: gfs_fsck invalid response to question changes the question bz 463817 - gfs_fsck can't decide which bitmap to fix When the gfs_fsck ran into a problem and asked whether to fix it, if the users gave an invalid response, the block referenced in the question would become a random number. That's because in function "query" it was parsing the arguments once, using the va_start function, but after the arguments have been parsed, it's left in an invalid state. 
The proper thing to do is to call va_start for each time we need to parse the arguments. 2008-09-25 David Teigland fenced: add protocol negotiation Same as dlm_controld and gfs_controld, but without the kernel protocol. dlm_controld: add protocol negotiation Exact copy from gfs_controld. Also fix a check in gfs_controld. The kernel proto checking is not yet linked to the actual kernel version. 2008-09-25 Jan Friesse fence: New fence agent for Logical Domains (LDOMs) It's tested on LDOM 1.0.3. Because interface is backward compatible, it will work with 1.0, 1.0.1 and .2 too. It's tested with bash and csh shells on host machine. 2008-09-25 Fabio M. Di Nitto dlm/fence/gfs: fix daemon spinning 100% due to memory corruption This is more a workaround than a real fix. When building with -O0, arg list to pthread_create is somehow corrupted (I suspect a gcc bug here as the problem doesn't show with any other -O levels), and data passed down to process_query are invalid. Stop passing arguments via pthread_create. Add simple sanity check and fallback in process_queries on accept() call that was the main cause of the spin. 2008-09-25 Dejan Muhamedagic Low: RA: apache: envfiles attribute to source extra environment (e.g. envars) --HG-- extra : convert_revision : f2bf43fb3e643ab3c2fe9a06663d6348034ad6c7 High: RA: LVM: stop correctly in case vol group does not exist --HG-- extra : convert_revision : 08e97e4d70f4230457b21017466d1c3c1c302cef 2008-09-24 David Teigland gfs_controld: add protocol negotiation For both daemon and kernel protocols, although the kernel protocol is not connected in any way to the actual kernel version yet. 
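The gfs_fsck fix of 2008-09-25 (bz 463817) turns on a C rule that is easy to miss: once a va_list has been consumed, it must be restarted with va_start before it is read again. A minimal sketch of the corrected pattern follows. The function format_prompt_repeated and its parameters are hypothetical and are not taken from gfs_fsck; they only illustrate re-asking the same question several times.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, NOT the gfs_fsck "query" function.
 * Formats the same prompt `times` times into `out`, the way a tool
 * might re-ask a question after an invalid answer. */
int format_prompt_repeated(char *out, size_t outlen, int times,
                           const char *fmt, ...)
{
    int written = 0;

    for (int i = 0; i < times; i++) {
        va_list args;

        va_start(args, fmt);   /* fresh traversal for every pass */
        written = vsnprintf(out, outlen, fmt, args);
        va_end(args);          /* args is indeterminate after use */
    }
    return written;
}
```

Calling va_start once outside the loop and reusing args on the second pass is the kind of mistake the entry describes: the second vsnprintf would read arguments from an already-consumed list.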
2008-09-24 Lon Hohberger cman: Don't let qdiskd update cman if the disk is unavailable rhbz#460937 cman: Fix broken qdisk main.c patch reverted with scandisk merge Re-fixes 442541 rgmanager: Resolve hostnames->IPs and back when checking NFS clients Also enable caching for improved performance in services with lots (hundreds) of individual mounters Bugzillas #246668 & #455324 rgmanager: Make clustat and clusvcadm work faster rhbz#461956 rgmanager: Implement enforcement of timeouts on a per-resource basis Set "__enforce_timeouts" to "1" in the resource tree in order to enable this behavior (e.g. not the global resources list). rhbz #455326 rgmanager: Clean up build General build cleanups. Also fixes small bug in check_rdomain_crash(). rgmanager: make status poll interval configurable This allows administrators to define an alternate poll interval; the default is 10 seconds. This has no functional change unless an administrator sets: rgmanager: Fix up clusvcadm.8 manual page to show -M option rhbz#460032 rgmanager: Wait for fence domain join to complete rhbz 459754, take 3 rgmanager: Permit careful restart w/o disturbing services ... e.g. for upgrades of rgmanager in-place for example. Note: Requires service-freeze patch Example use: * Manually freeze all services on a node. * Stop rgmanager (service rgmanager stop) * Upgrade rgmanager package * Manually start rgmanager from the command line 'clurgmgrd -N' rgmanager: Detect restricted failover domain crash Mark service as 'stopped' when it is 'running' but the node is down. rhbz #435466 rgmanager: Make freeze/unfreeze work with central_processing Part 2 of 2 for rhbz 448046 2008-09-24 Bob Peterson GFS2: gfs2_fsck: fix segfault while running special block lists. bz 463588 - GFS2: gfs2_fsck segfaults when extended attributes are on the file system The gfs2_fsck tool was running special block lists with osi_list_foreach but then it was sometimes deleting the entries from the lists. 
Therefore it should have been using osi_list_foreach_safe instead. 2008-09-24 Fabio M. Di Nitto misc: cleanup ifdefs around RELEASE_VERSION RELEASE_VERSION is always set at build time now. Make it consistent across the whole tree. fence: install fence_alom man page update build to install fence alom man page fence: update alom description add CMT version information in the alom header 2008-09-24 Lon Hohberger cman: show '-d' option in mkqdisk -h and mkqdisk.8 rhbz 459678 2008-09-23 Dejan Muhamedagic Low: RA: Filesystem: fix broken xml --HG-- extra : convert_revision : 0f3bb32fd483ba7d30e0c891b541974bfc4f3dd5 2008-09-22 Jan Friesse Fence: Added fence agent for Sun Advanced Lights Out Manager (ALOM) Because of strange behavior of ALOM SSH, which behaves more like telnet, fencing.py library is changed too. There is new option telnet_over_ssh (accessible only from source code of agent), which supports this behavior. fence: Fix fence agent for VMware ESX. Added support for identity_file. 2008-09-19 David Teigland gfs_controld: ignore uevents after first_done Ignore extraneous uevents after we see first_done; it's cleaner that way. Munging more debug messages to improve reading/debugging. gfs_controld: withdraw and recovery fixes Add handling for withdraw to new code, along with fixes to related recovery code. 2008-09-19 Christine Caulfield cman: Clean shutdown_con if the controlling process is killed. If a shutdown is initiated by a process that is then killed, the shutdown_con isn't cleared. So if another process replies to the shutdown request cman could segfault. 2008-09-17 David Teigland dlm_controld: ignore old plock dev when using new one Don't require /dev/misc/lock_dlm_plock to exist when we've found /dev/misc/dlm_plock. 2008-09-17 Abhijith Das Revert "gfs-kernel: bz298931 - GFS unlinked inode metadata leak" This reverts commit b32003be3784835ca1e79a490e052210303268ac. 2008-09-16 Jan Friesse fence: Fix fence agent for VMware ESX. 
Use dynamic import path and RELEASE_VERSION variable. cman: Removed old Perl version of VMware fence agent, so new version is built. 2008-09-15 Chris Feist cman: fixed makefiles to actually install the vmware manpage 2008-09-15 Christine Caulfield cman: rename 'move' functions to 'copy' After some thought I decided that removing the original config tree was unhelpful. And so I've also renamed the functions that do the copy to 'copy_*' from 'move_*' for clarity. 2008-09-13 Raoul Bhatia [IPAX] Medium: RA mysql: fix a typo --HG-- extra : convert_revision : 2f815d538ea119a154f3dc1c67dcba0553ae965b 2008-09-12 Abhijith Das gfs-kernel: bz298931 - GFS unlinked inode metadata leak Have inoded reclaim metadata from x rgrps at a time The tunable max_rgrp_free_mdata is the maximum number of rgrps to free unused metadata from during each cycle of inoded. Default is 5. libgfs2: Bug 459630 - GFS2: changes needed to gfs2-utils due to gfs2meta fs changes in bz 457798 The changes to the gfs2meta component of gfs2 (through bz 457798) do not require any major changes to gfs2_utils except this one-liner. We now use the gfs2 mount point rather than the device to mount the meta fs. 2008-09-12 Simon Horman Merge with upstream --HG-- extra : convert_revision : b6d762e738059059049e1df01e599b3d4a2f3345 2008-09-11 Marek 'marx' Grac [FENCE] Fix #460054 - fence_apc fails with pexpect exception In some special unspecified cases it is possible that the connection will be closed before we run close(). This is not a problem because everything is checked before. 2008-09-11 Christine Caulfield config: Get rid of files I committed accidentally. You know my opinion of git. cman: Copy "service" keys down to corosync Allow the user to specify multiple keys in cluster.conf, to include extra services. config: Allow multiple top-level keys XML & CCS config stopped reading top-level objects if it came across a duplicate. This was due to a break where a continue should have been.
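The break-versus-continue slip in the last entry above is a common one, and a small sketch may make the effect concrete. This is a hypothetical loop, not the actual XML/CCS config code; count_distinct_keys and seen_before are names invented for illustration. On a duplicate it skips the entry and keeps scanning, where a break would silently drop every key that follows.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if keys[idx] already appeared
 * earlier in the array, 0 otherwise. */
static int seen_before(const char *const *keys, size_t idx)
{
    for (size_t j = 0; j < idx; j++) {
        if (strcmp(keys[idx], keys[j]) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical illustration, NOT the real config loader:
 * counts the distinct key names in keys[0..n). */
size_t count_distinct_keys(const char *const *keys, size_t n)
{
    size_t distinct = 0;

    for (size_t i = 0; i < n; i++) {
        if (seen_before(keys, i))
            continue;   /* skip this one, keep reading the rest;
                           `break` here would stop at the first
                           duplicate and lose all later keys */
        distinct++;
    }
    return distinct;
}
```

With the keys "cluster", "service", "service", "logging", the loop as written counts three distinct names; replacing continue with break would stop at the second "service" and never reach "logging".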
2008-09-11 Jan Friesse fence: Fence agent for VMware ESX Another fence agent for VMware ESX which is written in Python using our fencing library. Old agent (written in Perl) segfaulted in my test environment. This agent is tested on VMware ESX 3.5 and Server 1.0.7. bz 251048 2008-09-11 Lars Marowsky-Bree Filesystem: Remove OCFS2 specific extensions. (No longer needed for upstream pacemaker+ocfs2) --HG-- extra : convert_revision : 316866b49bbce338266bc9cfd4bbc18ec1be544a 2008-09-10 Christine Caulfield cman: honour the dirty flag on a node we haven't seen before The dirty-node code used to check if a node had been down before honouring the dirty flag; this was to prevent nodes already in the cluster from kicking each other out at a transition. This had the problem that it could not detect if a new node joined that already had state (e.g. a new cluster started up in a split network). So, now we also check the 'first_trans' flag in the transition message so that we know when the node has newly joined a multi-node cluster. For more information see bz#460909 2008-09-10 Bob Peterson GFS2: sync buffers to disk when rewriting superblock bz 461290 GFS2: mount during fsck protections not working. When gfs2_fsck is run, it is supposed to rewrite the locking protocol in the superblock to "fsck_xxxx" (e.g. "fsck_dlm") to prevent all cluster nodes from mounting the file system while gfs2_fsck is running. The data was being written out, but the buffers were not synced to disk. That created a timing window where processes could still mount the file system. This was uncovered by the gfs2_fsck_stress test. This fix syncs the buffers to disk before continuing. Note that this still does not prevent users from running gfs2_fsck on file systems that are still mounted, but that is the way it has always been in the past.
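The timing window described in the gfs2_fsck entry above comes from treating a completed write as if the data were already on disk. The sketch below shows the general write-then-sync pattern using plain POSIX calls. It is an assumption-laden illustration, not the gfs2_fsck code: write_and_sync is a hypothetical name, and the real tool works through libgfs2 buffer routines rather than a bare file descriptor.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, NOT taken from gfs2_fsck: writes len bytes
 * to path and returns 0 only after the data has been flushed to
 * stable storage. Returns -1 on any error. (A production version
 * would also retry writes interrupted by signals.) */
int write_and_sync(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;

    while (left > 0) {              /* write() may be partial */
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    if (fsync(fd) != 0) {           /* close the timing window:   */
        close(fd);                  /* do not continue until the  */
        return -1;                  /* kernel has flushed buffers */
    }
    return close(fd);
}
```

Without the fsync step, the function could return while the new contents still sit in the page cache, which is the kind of gap the entry says allowed other nodes to mount the file system.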
2008-09-09 David Teigland gfs_controld: fix and implement remount Fixes problem with remount request mount.gfs was making; it wasn't specifying the locktable, causing a segfault. Implement the remount routines in the new cpg mode. 2008-09-08 David Teigland gfs_controld: ignore second leave When mount(2) fails we often get two leave requests. The second can be ignored; don't log an error message. mount.gfs: fix mount error handling When mount(2) fails, mount.gfs tells gfs_controld to leave the cpg. libgfscontrol requires the leave call to specify the locktable, which mount.gfs wasn't setting, causing it to segfault. 2008-09-08 Satoru SATOH fence: Add network interface select option for fence_xvmd 1. fence_xvmd selects wrong network interface to listen on if host has multiple interfaces and target interface is not for default route. As a result, fence_xvmd does not respond to fence_xvm's request. 2. fence_xvmd cannot start if default route is not set. Ex: fence_xvmd -I Signed-Off-By: Satoru SATOH Signed-Off-By: Lon Hohberger 2008-09-05 David Teigland fence_tool/dlm_tool/gfs_control: improve ls output format Improve the output formatting of 'ls' list operations to make the info easier to view. libdlm: /dev/misc/dlm-control created by udev Remove code that creates the dlm-control device. Depend on udev to create it like we do for other devices. dlm_controld/gfs_controld: handle merge of cpg partition Uncomment the code that handles stateful cpg's merging together. (Same as previous fenced commit.) Also make same changes to debug output as previous fenced commit. fenced: handle merge of cpg partition Uncomment the code that handles stateful cpg's merging together. The daemon ignores "start" messages from new (to it), stateful (non-zero started_count), cpg members. Because the ignored messages are used as a barrier, the ignored (disallowed) nodes must be removed (the nodes killed) for domain processing to continue.
Also, set the cman dirty flag so cman will create disallowed nodes if stateful/dirty clusters merge. This is necessary to avoid skipping fencing of a node after it's merged (the is_cman_member checks). Also improvements to various log_debug statements. 2008-09-05 Christine Caulfield cman: Initialise variable Actually, I thought the loader was supposed to do this... 2008-09-03 David Teigland fenced: joining daemon cpg to bypass fencing When the fenced daemon starts, it checks for uncontrolled instances of gfs/dlm, and if none are found, it joins a special "daemon" cpg (not the fence domain cpg). This join simply tells fenced on other nodes that the new node is in a cleanly reset state and they can skip fencing it if it's currently a victim. Currently, fencing is skipped if the victim just joins the cluster, but this is not sufficient since a node can join the cluster with uncontrolled gfs/dlm instances (it still needs to be fenced). In cluster2, the groupd cpg filled the role of this new fenced cpg in advertising the clean/reset state of a node. 2008-09-03 Bob Peterson Changes needed to stay current with libvolume_id. Changes needed to stay compatible with libvolume_id. 2008-09-03 Christine Caulfield config: Remove stray fprintf Signed-off-by: Christine Caulfield config: fix ldap load bug caused by new objdb ordering in corosync The LDAP loader assumed the old obdb behaviour that objects entered into the appeared in reverse order. This is no longer true - they are in the same order they were created. So this fixes LDAP to match this assumption. cman: cope better with malformed config files A config file with would fool cman into not finding the nodename, even though it exists. This fixes that. cman & config: Move special cases out of config modules It was previously the job of the config modules to move items like , etc from under and into the root of the tree so that corosync could find them. This is obviously absurd. So I've made cman-preconfig do this now. 
This means that the config modules do only what they should do, read in the config tree as they see it into objdb. Any other new keys we need to add to /cluster that actually apply to corosync can be added in one place. 2008-09-03 Chris Feist fence: fixed a fence storm with fence_egenera 4.8 - bz#437867 4.7.z - bz#459501 Committed on behalf of Jim Parsons (jparsons@redhat.com) 2008-09-02 Lon Hohberger cman: Fix qdiskd file descriptor leak Patch from Sean E. Millichamp rhbz#460645 2008-09-01 Andrew Beekhof New vmware resource agent from Cristian Mammoli See dev list for legal disclaimer --HG-- extra : convert_revision : fc047640072cc2a4e6c7cff6d0137bb986145957 2008-08-29 Simon Horman Merge with upstream --HG-- extra : convert_revision : b23ba3cf9cac5bc13f0ca0d5544fab0fc34382e6 2008-08-28 Bob Peterson GFS2: Make gfs2_fsck accept UNLINKED metadata blocks bz 460327 Originally, GFS2 did not use a block type of 2 in the bitmaps, so it was considered invalid. However, GFS2 now uses that block type to indicate unlinked metadata blocks. This allows for cases where an inode is unlinked on one node while still open on another node. This fix changes gfs2_fsck so that it ignores these blocks (eventually the file system will reclaim them) rather than reporting them as errors. 2008-08-28 Christine Caulfield cman: Allow a recently left node to join cleanly. If a node leaves cleanly and then joins within the corosync totem timeout then odd things can happen and the nodeslist can get inconsistent. With the rest of the cluster stack on top this is probably quite hard to do I suspect. 2008-08-28 Fabio M. Di Nitto ccs: deal with xml file format special case when talking about specific xpath queries, writing: .. is not the same as writing: fix memory overflow in both ccsd and xmlconfig. It is impossible to exploit this overflow for anything useful since it's used at the very beginning of the startup process when literally nothing is running and it causes the XML parser to crash. 
ccsd never showed this issue because it was using a pre-allocated buffer and it was always big enough to hold the data (even when we were writing more than our calculated size). xml, on the other hand, always needs to allocate. rgmanager: fix handling of VIP v6 when using ip to handle ipv6 address, we need to specify netmask on add and remove operations, the same way it is done in the ipv4 code. Fix bugzilla: #459582 libdlm: major cleanup Move libaislock to contrib/ as it was not built within libdlm for ages. Simplify libdlm/Makefile (first pass) and adapt the build system to cope with %_lt.o. Fix make/libs.mk to deal with multiple objects linked in a shared library. misc: init scripts clean up after discussion on cluster-devel mailing list, each init script should include /etc/sysconfig/$init_script_name. change all init scripts to source /etc/sysconfig/cluster for backward compatibility and then source /etc/sysconfig/$init_script_name to give priority to proper config file. also remove warnings about obsolete scripts. ccs: libccsconfdb header cleanup remove unrequired headers and make sure to use only corosync includes to build. Nothing in libccs uses openais services. 2008-08-27 David Teigland init.d/cman: use fence_tool -m for two node clusters bz 460190 Use the new fence_tool -m option in the cman init script for two node clusters. This delays fence_tool join when both nodes aren't members. The delay allows initial cluster partitions (due to badly configured network/switches) to converge before starting fencing. fence_tool: new option to delay before join bz 460190 Certain network/switch settings cause nodes to form partitioned clusters when they start up. Add code to better cope with these initial partitions. The network partitions are a particular problem for two_node clusters where a node has quorum when it starts up on its own. This adds a new fence_tool option -m, e.g. fence_tool join -m .
It causes fence_tool to delay the join by up to to allow all nodes in cluster.conf to become cluster members. This allows openais on the nodes to all see each other before starting the fence domain. So we join the domain *after* the nodes merge into a single cluster. If we joined the domain *before* the cluster partition merged, then nodes would end up being fenced unnecessarily. (This is a similar idea to post_join_delay; a delay that gives us time to determine that a node in an unknown state is actually ok and doesn't require fencing.) groupd: fix daemon quit on SIGTERM The daemon can quit immediately on SIGTERM if running in LIBCPG mode, without checking for groups. (The placeholder groups for blocking old cluster2 groupd's get in the way of checking gd_groups.) 2008-08-27 Marek 'marx' Grac [FENCE] Fix #237266 - LPAR/HMC fence agent Minor fix (thanks to brking@us.ibm.com) in get_power_status(). If the state is not 'Running' then it is considered off; originally the function returned an undefined value in specific cases (like panic on machine - 'Error'). [FENCE] Fix #448043 - Update man pages for fence agents Manual pages for APC, BladeCenter, iLO and WTI were updated mostly with information about using ssh and passwords (-x, -z, -i, -S). 2008-08-27 Fabio M. Di Nitto config: make more functions static no need to export internal functions. build: fix clean target of contrib section make sure we can always clean when building outside the source tree. cman: init script best to require $time I was getting some weird OOPS when using dlm based applications on one of the nodes. After a long debugging session it turns out that the node had a clock skew of several minutes. Depend on $time to be executed before we start to soften the problem by syncing date on all nodes. 2008-08-26 Fabio M. Di Nitto build: bump library soname to 3.0 Our shared library API can now be considered stable. By no means is the code bug free or feature complete but it is one step forward towards 3.0 release.
2008-08-26 Christine Caulfield cman: return the correct length of a message When I converted to corosync somehow the length of the header didn't get subtracted from the total message length as it was passed down the stack. This fixes that. 2008-08-25 David Teigland gfs_controld: ignore dlm uevents Some people use "gfs" or "gfs2" as the name of the fs, which results in dlm lockspaces named the same, and we want to ignore dlm uevents for those. 2008-08-25 Lars Marowsky-Bree Upgrade version to 2.99.1. --HG-- extra : convert_revision : 38b62f2b4cb3d4bdabf5f3c65efa9ef958efce47 core: remove cl_malloc completely. --HG-- extra : convert_revision : 54e82784257d8d5c4c0f2267dc4a64c17f4c2497 2008-08-22 Lon Hohberger rgmanager: Ancillary fix for rhbz #453000 See: https://bugzilla.redhat.com/show_bug.cgi?id=453000#c6 2008-08-22 Fabio M. Di Nitto build: add --without_config build option disables the building of all config/ subsystem. at the same time remove some old references of without_ccs from top level configure build: rename --enable_xen to --enable_virt xvm supports all virtualization implementations that libvirt supports and those are more than just xen. 2008-08-21 David Teigland fenced: add skip_undefined option bz 459127 New fenced config option would cause fenced to not do startup fencing of nodes with zero defined fence methods. The primary use for this option would be asymmetric cluster configs (http://sources.redhat.com/cluster/wiki/asymmetric_cluster_config) where client/small/spectator nodes do not join the fence domain and have no fencing configured. The problem we have is that even with no fencing configured, and not joining the fence domain, other nodes may attempt (and fail) to fence these client nodes during startup fencing. 
2008-08-21 Lars Marowsky-Bree Added changelog for 2.99.0, to be a beta release for 3.0.0 --HG-- extra : convert_revision : 020df9293b7c7c52d979131225f32e25a4cdd1ed 2008-08-21 Andrew Beekhof Merge with upstream --HG-- extra : convert_revision : a43958e72c689b478e242e25882a3fc37c1cb409 2008-08-20 David Teigland dlm_controld: isolate cman and fence code in member_cman.c, so that alternative code can be compiled instead. 2008-08-20 Andrew Beekhof Formatting in configure.in --HG-- extra : convert_revision : 138224137a3fc58c5fb9bd4c6ce7a806a219f9b1 2008-08-20 Lars Marowsky-Bree Disable building the quorum server by default, as it is impossible to deploy correctly. --HG-- extra : convert_revision : b32ca6086e32c64a81dadfd0eade2a15e685519e 2008-08-20 Fabio M. Di Nitto misc: remove exec bits from different files use python setup.py instead of invoking ./setup.py. build: plugin askant in our build system add contrib/askant/Makefile remove lib and inc sections from setup.py and drive them from askant/Makefile instead. build: add contrib/Makefile start plugging askant into contrib build: create contrib/ top level section now that the project is more open to the community, create a top level contrib section where to add community contribution. change build system to cope with it and make it disabled by default. add option --enable_contrib to top level configure move askant into contrib/ 2008-08-20 Andrew Price askant: Import askant into tree Askant is the beginnings of a file system performance analysis tool. This commit imports it into the cluster tree. Ref: https://bugzilla.redhat.com/show_bug.cgi?id=239656 2008-08-20 Christine Caulfield cman: Return quorum state in a STATECHANGE notification This should remove a potential race condition where quorum changes after the message is received. 
cman: add cman_tool -A to disable load of openais services I have moved the loading of openais services to the control of the openaisserviceenable configuration module (as it knows better which services are available). Loading of this module, and consequently the rest of the openais services is now controlled by the -A switch. openais services are enabled by default. -A disables them. 2008-08-19 David Teigland groupd: remove detection of uncontrolled kernel dlm and gfs since the dlm_controld and gfs_controld daemons now do that. fenced: kill the cluster on misbehaving nodes Kill cman on other nodes where the fenced process fails. gfs_controld: kill the cluster on misbehaving nodes Kill cman on other nodes where the gfs_controld process fails. Shutdown cman locally if we find uncontrolled filesystems at startup. dlm_controld: open dlm-monitor misc device if it exists, as it will in future kernels. This allows dlm-kernel to detect if dlm_controld fails, and stop lockspaces in response. Also rework/simplify handling of all misc devices; depend on udev to create the device nodes. Adds /dev/misc/dlm-monitor to the dlm's udev config file. 2008-08-19 Fabio M. Di Nitto qdisk: fix sysfs path diving 2008-08-19 Andrew Beekhof Clean up the remaining references to code deleted by lmb yesterday --HG-- extra : convert_revision : 29578cfbafba0928e5acdaeb9e6e5df5984ba494 Fix the libnet changes --HG-- extra : convert_revision : b2f8f1786ba883770ca31566c5bcd38459950003 IPaddr2 also needs to check if send_arp is available --HG-- extra : convert_revision : 262b7c4b2726f2378a25c411273ae0c26aae38e2 Merge with upstream --HG-- extra : convert_revision : 8572781681e125ead7cf2779c2fa6d5948883d36 Allow the project to build without libnet New option: --enable-libnet which defaults to 'try' If libnet is not found and we're not being built for linux, stop building unless --disable-libnet was specifically requested, in which case Ipaddr will not update ARP tables. 
Import the source code from arping as a send_arp replacement on linux when libnet isn't available Allow the code pieces of IPaddr to function if send_arp isn't available Moved send_arp.c to tools/send_arp.libnet.c --HG-- extra : convert_revision : 7ea7c05d3ed32508b3f529389bd40a977a0231d0 2008-08-18 Lars Marowsky-Bree Low: LF#1897: Filesystem/ocfs2: Do not change case for cluster name setting. --HG-- extra : convert_revision : 2b5231294f4f8e2d9fee01f9ecb6a09c4a51a6a6 Missed a ciblint reference. --HG-- extra : convert_revision : f762143a909d96ea97aefd876281fe7fa4ebd23e Remove mgmt; depends on pacemaker. --HG-- extra : convert_revision : cf17a3d4167bc3f085a5d81f8a90958aefc934b0 Delete SNMP subagent; depends on pacemaker, must be built outside. --HG-- extra : convert_revision : e491eaf38487f6e25d1c249ed85d675e9d7edadf Remove recoverymgrd; unmaintained. --HG-- extra : convert_revision : 0e7bbd4323b5893e68330a015b14dd128004dcc2 Removing CIM provider, now part of a different project, as it requires building against pacemaker. --HG-- extra : convert_revision : 54823f0e4ca8bd246ca688ac80681535674d5e08 Remove tsa plugin. --HG-- extra : convert_revision : 1c2dfab6ce4ee03f15b2f23cdd813635c5829411 2008-08-18 Simon Horman medium: remove snmp_subagent snmp_subagent now lives over in the pacemaker tree. It is plausiable that heartbeat specific version should live here, but until that time comes just remove it. --HG-- extra : convert_revision : baac6e0bdb06cfa6f3cfca7d6c156d5c40ade77d 2008-08-15 Fabio M. Di Nitto qdisk: allow scan of sysfs to dive into first level symlinks Some kernels populate /sys/block with symlinks when others don't. Allow sysfs to dive into symlinks at the top level to handle both. libccs: add support for /child::*[%d]/ for xpathlite xpathlite did not understand the concept of child::*[%d] within a path (ex: /cluster/rm/child::*[1]/@name). This operation is required by rgmanager to load the service tree. 
2008-08-15 Lars Marowsky-Bree RA: Xen: Don't destroy stopped DomUs. (Redundant.) --HG-- extra : convert_revision : 6d6f1890b4df429d8a8abe5a47a6ab6da6847151 The unit is ms, not s. --HG-- extra : convert_revision : b9cef454b4d10427570db94b39493dfea068f33d RA: Xen: Spelling an environment variable correctly helps ;-) --HG-- extra : convert_revision : 315e02ec74f6a57aec4046a2dd7a85d0de8d90b4 RA: Xen: Make stop actually work if shutdown failed. The RA will now properly escalate to "xm destroy" if shutdown takes too long, or if the timeout is set to 0 directly. Also removes relying on "xm -w", which is apparently not available everywhere. The code now also has a simplified control flow. --HG-- extra : convert_revision : fabe589636c67e5b12e1e8d2df6ea2afb498fe5c 2008-08-14 David Teigland gfs_controld: fix fs_notify during recovery Wasn't processing mountgroup again after an initial failed fs_notify, so no retry was happening. Also, the process_mountgroup check was wrongly skipping the apply_changes phase sometimes at the very start because there was no change struct yet, which would cause a segfault. dlm_controld: fix nodeid in fs_result was copying the wrong value so nodeid returned was zero 2008-08-14 Lars Marowsky-Bree STONITH: New externel/xen0-ha plugin to handle virtualized clusters. See http://wiki.linux-ha.org/DomUClusters for details. --HG-- extra : convert_revision : d615bfbe489de6e24f467e314cbc5d0afe9631a2 2008-08-14 David Teigland dlm_controld: kill the cluster on misbehaving nodes Kill cman on other nodes where the dlm_controld process fails. Shutdown cman locally if we find uncontrolled lockspaces at startup. 2008-08-14 Fabio M. Di Nitto build: define legacy_code=1 on clean target make sure to always clean everything around even when switching to/from legacy_code ccs: move ccs/daemon to config/daemons/ccds and mark it legacy code remove support for --without_ccs. ccs is now fully obsoleted and you need to specify --enable_legacy_code to build it. 
adapt top level Makefile to deal with cluster/ccs removal. fix build dependencies around cman/lib that's now required before config: target. move ccsd man page in config/daemons/man. ccs: move ccsais plugin to config/plugins/ccsais and mark it legacy code ccs: move libccscompat into config/libs and mark it legacy code config: move generic documentation and man pages to config/man ccs: move comm_headers.h to ccs/daemon comm_headers.h defines the protocol to talk to ccsd. move it to daemon and fix inclusion paths for the users. ccs: move debug.h to ccs/daemon ccsd is the only user for debug.h. No need to keep it in a generic include dir. ccs: libccscompat don't include unrequired header libccscompat doesn't require debug.h cman: switch default config parser to xmlconfig config: move ccs/ccs_tool to config/tools/ccs_tool ccs_tool is now a generic tool for handling configuration. Move it to a proper location. cman: make ccsd startup optional and allow override of config loader implement support for CONFIG_LOADER envvar to select which config parsers cman should use. start ccsd only if we explicitly select the ccsconfig CONFIG_LOADER. perform basic sanity check for other loaders in the init script. add some documentation on what loaders are available. 2008-08-14 Christine Caulfield cman: Silence some compiler warnings. These only show up with some compiler versions, but I don't like compiler warnings! cman: load openais services by default For compatibility with cluster2 we load a default set of openais services with cman/corosync. You can disable these by setting and adding your own services as needed. 2008-08-13 David Teigland dlm_controld: fs_register and fs_result fixes dlmc_fs_register now returns -EALREADY if the name is already registered. dlmc_fs_result now returns the nodeid in the field already provided. 2008-08-13 Christine Caulfield cman: fix objdb-destroying typo object_find_destroy is NOT spelled object-destroy!
cman: Fix find_handle leak Signed-off-by: Christine Caulfield 2008-08-13 Andrew Beekhof Replace the horrendously backwards spec file with the one used by the build service The build service version is proven to work for 25+ distro/version combinations and does not require the project to have been previously configured and built in order to be able to... configure and build the project. It also adopts a new package layout. --HG-- extra : convert_revision : b034c918cf1628cbe6a2338f3ee7d9d0fd4a86b8 2008-08-13 Fabio M. Di Nitto build: bump kernel requirement to 2.6.27 gfs-kernel module now needs 2.6.27. 2008-08-12 Abhijith Das gfs-kernel: bug 450209 - addendum to previous patch. Removes extraneous lock_dlm_plock.c I accidentally added this file to the patch I posted earlier to this bug. It's just an extra file at this point; Makefile doesn't even compile it. This commit removes it. gfs-kernel: Bug 450209: Create gfs1-specific lock modules + minor fixes to build with 2.6.27 gfs1 has its own lock modules now and is no longer dependent on gfs2.ko or lock_nolock.ko or lock_dlm.ko. This commit contains the lock modules patch and the following fixes to make gfs1 build with Steve's git tree. - change all instances of to - change all calls to permission() to inode_permission() - change all calls to remote_llseek() to generic_file_llseek_unlocked() I have been able to successfully compile with the patch against Steve's git tree, insmod the gfs.ko module, and mount a nolock filesystem using it. I don't have a cluster running upstream bits, so I couldn't test the module in a cluster. 2008-08-12 David Teigland gfs_controld: queries in libgroup mode Most of the query info doesn't apply when running in LIBGROUP mode, but some of the basic info can be provided. libdlm: handle truncated device names When lockspace names are over 15 characters long, they result in a device name that's over 19 characters long, e.g. dlm_0123456789ABCDEF.
Sysfs truncates device names at 19 characters, so the device name for this lockspace is /sys/class/misc/dlm_0123456789ABCDE, which udev also uses to create the device node, /dev/misc/dlm_0123456789ABCDE. So, when libdlm waits for udev to create the device node, it needs to look for this truncated name. It then creates and removes symlinks with the full lockspace name. Joel Becker identified the problem and came up with this solution. 2008-08-12 Fabio M. Di Nitto build: add support for corosync rename aisexecbin to corosyncbin add options for corosyncincdir and corosynclibdir and propagate them across all Makefiles 2008-08-12 Christine Caulfield cman (mainly): use corosync This patch changes cman to use corosync, the new split-up version of openais. It's mainly name changes, and changes to accommodate the new corosync API. It also changes the includes of other services from openais/ to corosync/ though I can't guarantee I've caught all of them. 2008-08-11 Fabio M. Di Nitto libccs: add support for /child::*[%d]/ for xpathlite xpathlite did not understand the concept of child::*[%d] within a path (ex: /cluster/rm/child::*[1]/@name). This operation is required by rgmanager to load the service tree. rgmanager: unbreak locking in clulib commit 1edb73bd098500d459c16797da2377a59f1ef180 introduced a set of checks for read/write operations. the error checks in cman.c were wrong and caused endless loops in rgmanager startup. fix those checks by making them "dumb" since we don't really care about the result of the operation directly, but other bits of code will take care of those. 2008-08-08 David Teigland dlm_controld: queries in libgroup mode Most of the query info doesn't apply when running in LIBGROUP mode, but some of the basic info can be provided. fenced: finishing off query stuff various loose ends group_tool: use mode from groupd to determine whether it should query groupd (old mode) or query the individual daemons (new mode).
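The name handling in the 2008-08-12 "libdlm: handle truncated device names" entry above can be sketched as follows. This is an illustrative Python model, not libdlm code: the 19-character limit, the dlm_ prefix and the /dev/misc location come from the entry, while the function and constant names are hypothetical.

```python
SYSFS_NAME_MAX = 19      # truncation limit stated in the changelog entry
DEVICE_PREFIX = "dlm_"   # prefix visible in the entry's example names

def dlm_device_name(lockspace):
    """Device name as sysfs would expose it: prefix + name, cut at 19 chars."""
    return (DEVICE_PREFIX + lockspace)[:SYSFS_NAME_MAX]

def dlm_device_path(lockspace):
    """Device node a caller would wait for udev to create."""
    return "/dev/misc/" + dlm_device_name(lockspace)

long_name = dlm_device_name("0123456789ABCDEF")   # 16-character lockspace
short_name = dlm_device_name("clvmd")             # short names are unchanged
long_path = dlm_device_path("0123456789ABCDEF")
```

With the 16-character lockspace from the entry, the model yields the same truncated name the entry quotes, which is the name a waiter has to look for instead of the full one.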
2008-08-08 Bob Peterson mkfs.gfs2: should have an optional fs size parm bz 450764 This patch fixes two problems with the previous patch for an optional "blocks" parameter to mkfs.gfs2. The bugs are: (1) If the blocks specified is too big, it gave a wrong message. This fixes the message and prints out what the numbers are (to help the user get it right). (2) If the number of blocks was specified, the device size was reported incorrectly. 2008-08-08 David Teigland dlm_tool: handle all join flags Add the ability to set all join flags, and print the flags being used when dlm_tool join is run. libdlm: remove device node creation/removal It's unnecessary as udev has been doing it for quite a long time. (Doing this ourselves also won't work correctly in the future with a pending dlm patch that will allow multiple creates and only the final release actually releases the lockspace.) 2008-08-08 Lon Hohberger [rgmanager] Re-fix permissions bits broken in last commit Remove execute flag from file permissions. [rgmanager] Fix resource agent metadata and un-break 'make check' target 2008-08-08 Fabio M. Di Nitto config: fix objdb2xml filtering ais/corosync has a set of extra information on the toplevel tree that includes cman: make sure not to umount configfs when there are other users Fix bugzilla: #457991 build: fix ccs_test symlink install target previous target was copying the whole ccs_tool into ccs_test again. 2008-08-07 Simon Horman Low: Remove bashisms from character classes It seems that ! rather than ^ is the negation operator in shell character classes See Ubuntu bug 248737 https://bugs.launchpad.net/ubuntu/+source/heartbeat/+bug/248737 --HG-- extra : convert_revision : 5aea0e314a77c15437765593a7b4f3f13c829f83 2008-08-07 Fabio M. Di Nitto ccs: turn more ccs_tool code into legacy code the update functionality connects to ccsd that is now legacy code. filter this out when building the normal ccs_tool.
2008-08-07 Simon Horman Merge with upstream --HG-- extra : convert_revision : cd3a345f81fa91ff1c6696ab2a4cb6df58853d18 Low: Cope with empty $RANDOM In some environments, such as dash, $RANDOM is empty. This fixes up the last two references that I could find in the tree to $RANDOM to make sure they are sensible in this case. --HG-- extra : convert_revision : 597cfd99deee5128d77022c53112dfc1f8686889 Medium: safely create temporary files After my previous post, I realised that there are lots of other places in the code where mktemp is used and more importantly, alternate unsafe code is used. So I have had a crack at cleaning things up. ------------------------------------------------------------------------ There are currently at least two problems with maketempfile. Firstly, there is a race in the following construct: rm -f "$F"; touch "$F" As an attacker could potentially create a symlink to "$F" between the call to rm and the call to touch. Secondly the use of $RANDOM appears to be a bashism. On dash its usage in BasicSanityCheck appears to evaluate to the empty string. See Debian Bug #489607, http://bugs.debian.org/489607 This patch takes the approach of using mkdir, which is atomic, to create a safe place to store the logfile and the temporary directory that is used by the filesystem check. The patch also makes sure that the return value is checked and the script exits cleanly if an error occurs. This code is always used, there seems no point in providing a robust fallback to mktemp - if it's robust, it can be used :-) For a discussion of creating temporary files in shell see http://www.linuxsecurity.com/content/view/115462/81/ --HG-- extra : convert_revision : 82e3625e0198af52eb22ef8a42cebf5050ab69e3 2008-08-06 David Teigland gfs_controld: register with dlm_controld earlier dlm_controld now allows us to register interest in a lockspace before the lockspace is created, so do it right away instead of after mount(2) completes, which has the potential of being too late.
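The race described in the "safely create temporary files" entry above comes from removing and then re-creating a predictable path in two separate steps. The patch itself is shell; the sketch below restates the same idea in Python, relying on mkdir either creating the directory or failing when anything, including a symlink, already exists at that path. The helper name and the directory names are hypothetical.

```python
import os
import tempfile

def make_private_dir(parent, name):
    """Create parent/name in one atomic step; raise FileExistsError otherwise."""
    path = os.path.join(parent, name)
    os.mkdir(path, 0o700)   # never follows or replaces an existing entry
    return path

def demo():
    parent = tempfile.mkdtemp()   # scratch area for the demonstration
    created = make_private_dir(parent, "ha-check")

    # A second attempt on the same name must fail instead of reusing the path.
    try:
        make_private_dir(parent, "ha-check")
        reuse_refused = False
    except FileExistsError:
        reuse_refused = True

    # A symlink planted by someone else also makes mkdir fail (POSIX systems).
    planted = os.path.join(parent, "planted")
    os.symlink(os.path.join(parent, "elsewhere"), planted)
    try:
        make_private_dir(parent, "planted")
        symlink_refused = False
    except FileExistsError:
        symlink_refused = True

    os.unlink(planted)
    os.rmdir(created)
    os.rmdir(parent)
    return reuse_refused, symlink_refused

reuse_refused, symlink_refused = demo()
```

Because creation and the existence check happen in a single call, there is no window between "remove" and "create" for an attacker to exploit, and the caller learns about a collision through the error rather than by silently writing through a link.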
dlm_controld: allow early fs_register fs_controld daemons need to register their interest in a lockspace during startup before they've actually joined the lockspace, which means we need to record the registration before the lockspace exists and then set the registered flag when it's created. 2008-08-06 Andrew Beekhof IPaddr2: Use a better name for the unique clone address functionality --HG-- extra : convert_revision : 5db067880840a656e036b434649027c8a0c956d7 Hg: Merge in some ipaddr changes --HG-- extra : convert_revision : f662a7b6cc953dbd0d813420d3b90289b60193e6 IPaddr2: Allow IPaddr2 to be cloned in such a way such that each instance gets a unique address --HG-- extra : convert_revision : f2ff2eff826b048b2dc6717a7b16e893bd76075e 2008-08-05 Fabio M. Di Nitto build: properly respect non standard libdir and incdir We allow users to set libdir and incdir to non standard locations. Those values need to be propagated properly within the build system to have a higher priority than system locations but lower than specific paths. qdisk: port to new logsys api switch to new logsys api and fix a bug where debug output was always enabled. 2008-08-04 Fabio M. Di Nitto ccs: move to the new logsys init API Change ccs daemon to use the new logsys API. 2008-08-04 Lars Marowsky-Bree RA: Xen: use quotation for clarity. --HG-- extra : convert_revision : 764c66fc6e12849eb018944805fe93877e3b5bd4 RA: Xen: Unify formatting of if clauses. --HG-- extra : convert_revision : 39918a103caef701fc3c6b07fa71a84e2b656015 RA: Xen: Fix handling of installation errors. Handle the absence of the "xm" command. Improve handling of the configuration file not being present at start time. --HG-- extra : convert_revision : fc909b18b81ac94a18e8a40d770e1ac97649716e 2008-08-04 Christine Caulfield cman: tidy objdb_get_int Make objdb_get_int take a default value that gets filled in when there is no value in the objdb.
This tidies the situation hugely as well as fixing bugs that arose because of the ambiguous nature of the function's returned values. 2008-08-04 Fabio M. Di Nitto build: drop "all" dependency from install: targets this change should address the last problem reported by Joel. build: fix several issues related to install and build targets Several users have been reporting issues building and installing our source from an nfs mount. There are two related issues in this scenario (P = Problem): P1) nfs servers often drop root privileges to nobody (very often required for install:) P2) our build system had issues in the past with linking against static libraries. the combination of the two ends with a user "nobody" trying to link one of our tools at install time and of course it fails because the nfs server would refuse "nobody" to access a directory/file that the user owns. We address this issue by doing a set of simple changes around (A = Action): A1) Remove all PHONY targets so that at install time we do not re-link bits that are already updated. A2) Introduce a more detailed dependency tracker for static libraries by using LDDEPS var within the affected Makefiles. This replaces the requirement of PHONY targets in A1). Note: changes to header files are already tracked by .d files, LDDEPS tracks changes to the binaries. A3) Make sure that all targets that require static libs are updated to use LDDEPS. Extra benefits from this change (B = Benefit): B1) no relink at install time. B2) install target does not require any devel library installed on the target system because of B1 B3) install target can be executed in parallel from several machines. For example when installing on N machines from the same nfs share. This was an issue before because of a race condition now fixed by B1.
(two or more machines could try to link the same binary at the same time and fail) The only con is that A2 requires a bit more manual work on tracking linking against static libraries, but the changes in that area are not frequent enough to hold this fix. 2008-08-01 Ryan O'Hara ip.sh: add sleeptime parameter Allow user to specify amount of time to sleep after removing an IP address ('stop' command). The sleeptime parameter is specified in number of seconds. Default is 10 seconds. Setting sleeptime to zero will result in no sleep. 2008-08-01 Fabio M. Di Nitto fence: remove unrequired headers from rackswitch also fix a failure to build with 2.6.27 kernel headers 2008-08-01 Christine Caulfield cman: exit if configuration check fails. The error return from read_cman_config wasn't being checked, so configuration errors could cause segfaults later on in startup. cman: Revert dirty patch Revert patch 288ab73e51f51ce174f51dc2fc67c6dd1fe03e85 as it's dangerously wrong and based on several misunderstandings. see bz#443358 for more information 2008-07-31 David Teigland fence_tool: add domain member checks using libfenced library. Also clean up the code and logic for waiting and timeouts. 2008-07-31 Christine Caulfield qdisk: fix compile error when building without debug. 2008-07-31 Fabio M. Di Nitto bindings: fix CCS.pm doc purely cosmetic change to doc file. build: fix bindings build when using external object tree Stop symlinking files that are generated on the fly in objdir. Propagate Makefile.bindings into objdir. 2008-07-31 Bob Peterson gfs2_edit: Improved gfs journal dumps bz 450004 This patch adds three important improvements to journal dumps: 1. It now recognizes and dumps GFS1 log descriptor continuation blocks and dumps them correctly. 2. It now prints a marker where the journal wrapped so you can locate the most recently added entries easily. 3. The absolute block number of the journal block is now printed with the entries (in addition to the journal offset).
This makes it easier to find the correct journal block you need to see. 2008-07-30 David Teigland gfs_controld: use group_mode detection The new default is to use group mode detection from groupd. This is -g2 (command line) or (cluster.conf). dlm_controld: use group_mode detection The new default is to use group mode detection from groupd. This is -g2 (command line) or (cluster.conf). fenced: use group_mode detection The new default is to use group mode detection from groupd. This is -g2 (command line) or (cluster.conf). groupd: detect group_mode command line -g2, or cluster.conf If old cluster2/RHEL5 nodes are in the cluster, cluster3 nodes will adopt the groupd compatibility mode to interoperate with them (this is the compat mode you get directly with -g1 or groupd_compat="1"). If no cluster2/RHEL5 nodes are in the cluster, cluster3 nodes will use the new group mode that doesn't go through libgroup, and is not compatible with cluster2/RHEL5 nodes (this is the non-compat mode you get directly with -g0 or groupd_compat="0"). This new mode (groupd_compat="2") is the default to favor the case of rolling cluster2->cluster3 upgrades, where cluster2 nodes and cluster3 nodes need to interoperate in a single cluster for a limited time. After all cluster2 nodes have been upgraded to cluster3, groupd_compat="0" can be added to cluster.conf the next time the entire cluster is taken down. It is still best to set groupd_compat="0" in cluster.conf for: . new clusters that don't require compatibility with cluster2 nodes . old clusters that are taken offline while the nodes are all upgraded from cluster2 to cluster3 2008-07-30 Dejan Muhamedagic Low: RA apache: exit with on unrecognized actions --HG-- extra : convert_revision : aaba921b100c0219aef0bf959851b21ea54afb00 2008-07-30 Fabio M. Di Nitto init: standardize init scripts to /etc/sysconfig/cluster Allow users to set everything from one single external config file rather than several. 
Retain backward compatibility by sourcing the old files and warning about the change. If a config option is present in both old and new file, the value from the new file will be used. misc: clean up "char const *" vs "const char *" this is cosmetic from a C point of view, but the perl xs preprocessor complains about it. while at it, clean up the typemap file that can be empty now. build: clean up perl bindings build system create a generic make/binding-passthrough.mk to use Makefile.bindings because the final Makefile is created by perl create a generic make/perl-binding-common.mk to include basically all targets that can be shared by different bindings simplify bindings/perl/Makefile to use only binding-passthrough.mk create bindings/perl/ccs/Makefile.bindings to use bindings infrastructure generate CCS.pm and META.yml from respective .in files to change data at build time (for example release version) 2008-07-29 Fabio M. Di Nitto bindings: improve Cluster::CCS description improve Cluster::CCS description bindings: add first cut of perl Cluster::CCS Add new perl binding for libccs. This is a first version and while the code works fine, the build system and many other details are still not clean. Revert "test commit" This reverts commit 900bca2d2bc1f03d22623ea5cbc4329d7f0799b9. test commit test commit 2008-07-29 Christine Caulfield [CMAN] Display the node's votes in cman_tool status I'm not sure how this one slipped the net. [CMAN] pass COROSYNC_ env variables to the daemon cman_tool makes its own environment for the aisexec daemon to use but some newer config modules like to have configuring environment variables. So now cman_tool passes down all environment variables starting with COROSYNC_ down. To make this work, I've also changed the plugins that currently take environment variables to name them COROSYNC_. Yes, I know we're not using corosync yet, but this is just one less thing we'll need to remember to change next week or whenever. 2008-07-29 Fabio M.
Di Nitto config: allow users to override default config file in xmlconfig By setting CLUSTER_CONFIG_FILE in the environment it is possible to use an alternate cluster.conf. 2008-07-28 Ryan O'Hara cman: fix typo (#!/bin/bash) from previous commit Line 0 of the cman init script was #!/BIN/BAsh? cman: add option to init script to prevent joining the fence domain New sysconfig variable FENCE_JOIN will control whether or not the node joins the fence domain. This variable can be set to either value "yes" or "no". When FENCE_JOIN is set to "no", the init script will not attempt to join the fence domain. Any other value is equivalent to "yes", in which case the init script will attempt to join the fence domain. (BZ #455598) 2008-07-28 Christine Caulfield [CMAN] Fix overridden node names cman_tool join -n was failing because it assumed an old object context. It now uses the nodeslist functions I wrote for the purpose. 2008-07-28 Fabio M. Di Nitto rgmanager: init script does not need network config rgmanager init LSB header already express the requirement for network via Required-Start: cman that in chain Required-Start: $network. Lack of network will not start cman and as consequence rgmanager. cman: init script should not use cluster.conf directly fence_xvmd function needs to query the config to know if the daemon should start or not. switch an xmllint call to a ccs_tool query and remove hardcoded dependency on cluster.conf 2008-07-25 Fabio M. Di Nitto rgmanager: fix clean target stub clean target in init script Makefile 2008-07-25 Ryan O'Hara gfs_mkfs: change the way we check to see if a device is mounted New method will attempt to open device with O_EXCL flag. If open() responds with EBUSY, mkfs will die. This indicates that the device is either mounted or part of an LVM volume. (BZ #426298) 2008-07-25 Bob Peterson gfs2_edit: was parsing out gfs1 log descriptors improperly GFS2 log descriptors have 8 bytes per entry which represents a block number.
GFS1 log descriptors have a small structure that is 16-bytes, the first 8 bytes of which is the block. So gfs2_edit was mistaking the extra 0x00000000 for a GFS2 end-of-block marker when it shouldn't have. This fixes it. gfs2_edit: Ability to enter "journalX" in block number. With gfs2_edit, you can position to the "block #" field and press , then enter a block number to jump to. You may also enter a keyword, like "root" to jump to the root directory, "rindex" to jump to the rindex, etc. Most keywords worked, but you could not enter "journal0" to jump to the first journal until this fix made it possible. 2008-07-25 Fabio M. Di Nitto fence: port scsi agent to use ccs_tool query and drop XML::LibXML requirement The new ccs_tool query is able to answer to the queries from fence_scsi*. Drop the need of XML::LibXML. Use ccs_tool query directly. Abstract from the need of parsing cluster.conf that is not the authoritative config file anymore. Drop unrequire check_config_nodes that's already embedded in the need of cman to run this code. fence: simplify init script fence_scsi needs cman to run. Verify that cman is running as first thing, before querying for config information that are now stored in cman/aisexec. cman already guarantees that all nodes have a nodeid. Drop this redundant check. rgmanger: remove check on cluster.conf from rgmanager init script cluster.conf is not the only authoritative configuration file anymore. LDAP and others could be in place. Remove the check and allow rgmanager to start. Revert "rgmanger: remove check on cluster.conf from rgmanager init script" This reverts commit 691c72052655c0f7c8142c35145237e122ae6b86. Revert "fence: simplify init script" This reverts commit 6a0647657348dd732615b7a0b7d6aad89c85b93a. Revert "fence: port scsi agent to use ccs_tool query and drop XML::LibXML requirement" This reverts commit e968098fa53e14d1cc9e60c42a4102674ecb51d2. 
2008-07-25 Andrew Price fence: port scsi agent to use ccs_tool query and drop XML::LibXML requirement The new ccs_tool query is able to answer to the queries from fence_scsi*. Drop the need of XML::LibXML. Use ccs_tool query directly. Abstract from the need of parsing cluster.conf that is not the authoritative config file anymore. Drop unrequire check_config_nodes that's already embedded in the need of cman to run this code. fence: simplify init script fence_scsi needs cman to run. Verify that cman is running as first thing, before querying for config information that are now stored in cman/aisexec. cman already guarantees that all nodes have a nodeid. Drop this redundant check. rgmanger: remove check on cluster.conf from rgmanager init script cluster.conf is not the only authoritative configuration file anymore. LDAP and others could be in place. Remove the check and allow rgmanager to start. 2008-07-24 Lon Hohberger [qdisk] Make stop_cman="1" work if heuristics fail during initialization Bugzilla #455865 2008-07-24 Andrew Price [GFS2] libgfs2: Build with -fPIC Build libgfs2 with -fPIC to enable linking on x86_64 and others. 2008-07-24 Bob Peterson gfs2_fsck dupl. blocks between EA and data RGRepair: Account for RG blocks inside journals Better error reporting in gfs2_fsck Shrink memory 3: smaller link counts in inode_info Shrink memory 2: get rid of 3 huge in-core bitmaps Shrink memory 1: eliminate b_size from pseudo-buffer-heads Deleted unused function print_map Fix some bad references to gfs_tool and gfs_fsck gfs_fsck crosswrite for block number sanity checking Speed up userspace bitmap manipulation code. 2008-07-24 Fabio M. Di Nitto build: update .gitignore Add some more files we can ignore 2008-07-23 Fabio M. Di Nitto [BUILD] Cleanup groupd makefile 2008-07-23 Ryan McCabe fence: update apc snmp agent Pushing update to fix bz447414 for jparsons 2008-07-22 David Teigland groupd: use logsys Add logsys usage and configuration, following fenced pattern. 
2008-07-22 Lon Hohberger [rgmanager] Add optional save/restore to vm resource This patch adds optional save/restore support to virtual machines. Patch Federico Simoncelli federico dot simoncelli at gmail dot com 2008-07-22 David Teigland dlm_controld/gfs_controld: add logging.c file which I forgot to add in previous commits. 2008-07-22 Fabio M. Di Nitto [BUILD] Cleanup linking order for logsys Collect all openais libs into their own LDFLAGS entry Linking to logsys requires pthread [FENCE] Fix fence_apc_snmp logging Move log file together with all the others when invoked in verbose mode. [BUILD] Fix LOGDIR usage LOGDIR was duplicate of logdir. Remove the duplication and use logdir as it should be. Also add logdir info to fencebuild, required for fence_apc_snmp [FENCE] Sync fence_apc_snmp from RHEL47 branch 2008-07-22 Bob Peterson Print log header flags for gfs journals. 2008-07-21 David Teigland gfs_controld: use logsys Add logsys usage and configuration, following fenced pattern. dlm_controld: use logsys Add logsys usage and configuration, following fenced pattern. fenced: munge logging to prepare for copying it to the other daemons. 2008-07-18 David Teigland fenced: complete messages copy start messages use the same formats and functions for handling "complete" messages dlm_controld: improved start messages Copy the improved start/sync message formats from fenced/gfs_controld, which allows the messages to be extended in the future without breaking compatibility. 2008-07-18 Andrew Beekhof Merge with upstream --HG-- extra : convert_revision : 523e0d5629c36b75012a41ad1288a14c4041fcbe 2008-07-18 David Teigland fenced: debug logsys options Enable/disable logsys debugging in the following ways, in order of priority: . command line -L [0|1] . environment variable FENCED_DEBUG_LOGSYS [0|1] . cluster.conf logging/logger_subsys/debug [off|on] . cluster.conf logging/debug [off|on] fenced: munge config option code to match the code in other daemons. 
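The priority order listed in the "fenced: debug logsys options" entry above (command line -L, then the FENCED_DEBUG_LOGSYS environment variable, then cluster.conf logging/logger_subsys/debug, then cluster.conf logging/debug) reads as a first-match lookup. A minimal Python sketch of that rule follows. It is not fenced's code: the function and argument names are hypothetical, None stands for "not set", and the default used when nothing is set is an assumption.

```python
def resolve_debug(cli=None, env=None, subsys=None, global_=None, default=False):
    """Return the first setting that is present, highest priority first.

    cli     -- value of the -L command line option, if given
    env     -- value from FENCED_DEBUG_LOGSYS, if set
    subsys  -- cluster.conf logging/logger_subsys/debug, if configured
    global_ -- cluster.conf logging/debug, if configured
    default -- assumed fallback when no source sets a value
    """
    for value in (cli, env, subsys, global_):
        if value is not None:
            return value
    return default

cli_wins = resolve_debug(cli=False, env=True, global_=True)
env_over_conf = resolve_debug(env=True, subsys=False)
conf_only = resolve_debug(global_=True)
nothing_set = resolve_debug()
```

Treating "unset" separately from "off" is what lets an explicit -L 0 on the command line override a debug="on" in cluster.conf.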
fenced: improved start messages Copy the improved start/sync message formats back from gfs_controld, which also makes it simple to sync fencing history to new nodes. gfs_controld: close dlm_controld connection when we get a poll error from it 2008-07-18 Andrew Beekhof Indicate the type of allocation being used - nicer output --HG-- extra : convert_revision : 95b9cf7578f451b1509cab6cf93798d499bec8e2 Indicate the type of allocation being used --HG-- extra : convert_revision : 813b5c0be196b123abc45721fb17fe27fa42f95d 2008-07-18 Fabio M. Di Nitto [BUILD] Fix logrotate snippet filename [BUILD] Fix ccs_tool linking dir order 2008-07-17 David Teigland gfs_controld: byte swap ids earlier before they are used in match_change. 2008-07-17 Fabio M. Di Nitto [RGMANGER] Fix call to ccs_tool [RGMANAGER] Port ccs_get to proper ccs_tool output Revert "[RGMANAGER] Use proper ccs_tool query output" This reverts commit 2eaca4f9e31409f110ae318d1f192673dd744e33. [CCS] Inflict hopefully last compat issues love to ccs_t* [BUILD] Fix ccs_tool/ccs_test build with new compat code [RGMANAGER] Use proper ccs_tool query output 2008-07-17 Christine Caulfield [CCS] Make ccs_tool/ccs_test more consistent After much chatter, these tools are more consistent. ccs_test exists for old and new code, and always returns the old format output. ccs_tool query always returns new format output and the -c flag has been removed. 2008-07-17 Fabio M. Di Nitto [BUILD] Fix race condition in oldconfig update/execution 2008-07-17 Christine Caulfield [CCS] Set return status on failure [CONFIG] Add some more errnos to libccsconfdb 2008-07-17 Fabio M. 
Di Nitto [RGMANAGER] Port smb resource agent to ccs_tool Fix also a wrong arg in get_service_ip_keys in config-utils.sh.in 2008-07-16 David Teigland gfs_controld: add missing endian conversion for id_info structs in start messages gfs_controld: change start message from new members Have new nodes include an id struct for all members so we have the extra verification step when matching start messages to changes. 2008-07-16 Fabio M. Di Nitto [RGMANAGER] Port all resource agents to new ccs interface [CCS] Kill obsolted ccs_test 2008-07-16 Christine Caulfield [CCS] add -c flag to ccs_tool query The -c (compat) flag displays the query results in the same format emitted by the old "ccs_test" tool. 2008-07-16 Fabio M. Di Nitto [BUILD] Fix doc install target when building objects outside source tree [BUILD] Fix ccs.h include path 2008-07-16 Keisuke MORI build: configure subdirectories to prevent an error on building RPMs --HG-- extra : convert_revision : 5cc4328d1ccdcba47605295e8cc8381457d18087 2008-07-16 Fabio M. Di Nitto [BUILD] Add ccs_test replacement when building legacy_code [BUILD] Implement --enable_legacy_code in the build system [CCS] Fix LEGACY_CODE ifdef 2008-07-16 Christine Caulfield [CCS] Fold ccs_test into ccs_tool and tidy Move all the code from ccs_test into ccs_tool and make it switch depending on argv[0]. Also add ccs_tool query that does full XPath queries on confdb. When build with LEGACY_CODE set ccs_tool will behave as it used to, talking to ccsd. Without that it will use the new confdb system and assume that ccsd is NOT used for distribution of cluster configurations. 2008-07-15 Christine Caulfield [CMAN] Don't use logsys in config modules. Revert "[CMAN] Don't use logsys in config modules." I'm not sure how those deletes got in there, sorry This reverts commit 41e7b5a1db20a86d6f5306afeab89944f44bc7b2. [CMAN] Don't use logsys in config modules. 
I must not use libraries in config modules I must not use libraries in config modules I must not use libraries in config modules I must not use libraries in config modules I must not use libraries in config modules I must not use libraries in config modules I must not use libraries in config modules I must not use libraries in config modules I must not use libraries in config modules I must not use libraries in config modules [CCS] Set errno when an error occurs. Set errno to the error code to be consistent with other libraries. 2008-07-15 Florian Haas Low: RA Route: new OCF RA to manage IP routes --HG-- extra : convert_revision : 9ac97992de2ceefcc18727a6531e2b4a9a0a8ba4 2008-07-14 David Teigland dlm_controld: set id before recovery The lockspace id needs to be set in the kernel before starting recovery in the kernel, which depends on a valid id. fenced: fix logsys define name is LOG_MODE_FILTER_DEBUG_FROM_SYSLOG fenced: enable new logsys mode flag since it's been commited to openais svn 2008-07-14 Dejan Muhamedagic Medium: RA apache: fix exit code in case the binary is missing (thanks to Dominik Klein) --HG-- extra : convert_revision : 6c63b581d341e98bfa968e2377df945abdb49a5d 2008-07-14 Christine Caulfield [CMAN] Remove some spurious prints They were left in after debugging. 2008-07-14 Fabio M. Di Nitto [MISC] Fix build with newer toolchain 2008-07-14 Fabio M. Di Nitto [BUILD] Clean extra kernel modules files 2008-07-11 David Teigland groupd: sync daemon setup/structure with others This daemon is defunct, but will continue to exist for back compat cases, so put code in sync with other daemons. fenced: tune logsys settings put errors in fenced.log (in addition to syslog) by default keep debug out of syslog by default (commented out pending openais commit) 2008-07-11 Fabio M. Di Nitto [MISC] Create and install logrotate file With daemons writing their own log files, we want to rotate them properly. 
Add logrotate snippet in doc/ for who wants to do it manually Install it in logrotate dir (you will need to rerun configure for this to be set properly) Adapt build system to propagate info around 2008-07-10 David Teigland fenced/dlm_controld: fix quorum waiting Fix how fenced and dlm_controld check for quorum, and how they verify that quorum is adjusted for all the necessary failures. Fix how dlm_controld creates configfs entries for existing cman members. fenced/dlm_controld/gfs_controld: ccs/cman setup Consistently set up and clean up ccs and cman. 2008-07-10 Lars Marowsky-Bree Automated merge with ssh://hg.linux-ha.org/dev --HG-- extra : convert_revision : 0e5830a61a3e9c31516aaf1bad83459a1e7bdbe8 RA: o2cb: Add hint that it should not be used. --HG-- extra : convert_revision : ba3ce239ebfa404730ffc8535f1cb476bf2c0139 2008-07-10 Christine Caulfield [CONFIG] Add a man page for confdb2ldif and make the program do as the man page says. 2008-07-10 Fabio M. Di Nitto [BUILD] Plug confdb to ldap tool 2008-07-10 Christine Caulfield [CONFIG] rename ldap config generator Rename loadldap to confdb2ldif. This more accurately expresses what the program does. I've also added a comment to the generated ldif file. 2008-07-10 Fabio M. Di Nitto [CONFIG] Fix loadldap include 2008-07-09 David Teigland gfs_controld: add journal for new node Wasn't adding a journal struct for a new node. gfs_controld: add query code Also let group_tool query new daemons. 2008-07-09 Christine Caulfield [CONFIG] Add ldap loader Add a new tool to load an existing cluster config into LDAP. This is as incomplete as the schema at the moment, but it will allow you to add nodes and some very basic fencing devices. It uses libconfdb so it can migrate either a running system or a cluster.conf file. 
2008-07-07 David Teigland gfs_controld: support queries from gfs_control through libgfscontrol fenced: link with liblogsys Makefile was missing -llogsys and had duplicate -lcpg fenced/fence_node: use SYSLOGLEVEL cluster config setting instead of logsys definitions. 2008-07-07 Christine Caulfield [CONFIG] Add some more ldap comments Show what sort of cluster.conf file the ldap example maps onto. Fix some schema errors. 2008-07-04 Fabio M. Di Nitto [MISC] Fix logging file query This fixes the config key from filename to logfile, as it is supposed to be. I misread the openais documentation on this specific bit and ended up using filename instead of logfile. 2008-07-04 Christine Caulfield [CMAN] Remove some redundant code. 2008-07-03 David Teigland fence_node: use simple logsys api instead of the macros. fenced: use logsys - Set up ccs connection once at the start and keep it open. - Read logging configuration from ccs. - Replace calls to syslog with calls to logsys. - Direct debug statements to logsys. - cman setup uses cman_is_active - cman setup retries cman_init and cman_is_active 2008-07-03 Christine Caulfield [CMAN] Fix logging options Fix up the logging for the cman plugin so that it consistently honours both settings in cluster.conf and 'cman_tool join -d'. This has exposed a couple of bugs in openais which should be addressed shortly. 2008-07-03 Fabio M. Di Nitto [MISC] Update .gitignore [BUILD] Add make oldconfig target configure invocation now creates a .configure.sh script file at the top level of the tree. It contains a shell script to reproduce the last invocation of configure and it allows adding extra config parameters. Note that .configure.sh is updated each time you invoke configure. Also add a make oldconfig target that executes the script.
[MISC] Use default configured SYSLOGLEVEL across the tree [BUILD] Allow users to configure default built-in syslog level [BUILD] Fix telnet_ssl build [BUILD] Fix install of telnet_ssl 2008-07-02 Lon Hohberger [fence] Make fence_xvm[d] use normal log levels It used LOG_NOTICE for info-level and LOG_INFO for debug-level notices due to noise from logsys which has now been resolved in the trunk of openais. 2008-07-02 Marek 'marx' Grac [FENCE] Bug #448822: fence_ilo doesn't work with iLO The new fencing agent for iLO used ssh/telnet to connect, but unfortunately there is a problem with power off. This is why we need to use an SSL connection and RIBCL commands. As there is no (?) telnet with SSL support in RHEL, we need one to be able to use the same infrastructure as in other agents. This agent was not tested with RIBCL version < 2.0 (these parts were just ported from the old perl fencing agent) @todo: we have to put telnet_ssl.py somewhere, I'm not sure where 2008-07-02 Christine Caulfield [CMAN] Only do timestamp check for older nodes. 6.1.0 nodes set the dirty flag, so there's no point in checking the join timestamp as it just propagates the potential problems that the dirty flag was meant to fix. 2008-07-01 Christine Caulfield [CMAN] Add a config update callback 2008-07-01 Simon Horman Remove some Bashisms from the OCF Xen and Filesystem resources Fixes Debian Bug #487167 See: http://bugs.debian.org/487167 Thanks to Luca Falavigna --HG-- extra : convert_revision : 055ffbfe69b2b3810f9c71c491c30684fd67b669 2008-07-01 Fabio M. Di Nitto [MISC] Documentation cleanup Move all licence and copyright bits into doc/ Add doc/Makefile and use DOCS= infrastructure to install all documentation.
[BUILD] Install ldap schemas and example in document directory [BUILD] Add install/uninstall snippets for documents [BUILD] Fix docdir default path [BUILD] Allow configuration of docdir 2008-06-27 Lon Hohberger [fence] Fix XVM's debug.c default [fence] Port XVM to logsys * For full debugging, * Intentionally does not use LOG_DEBUG (yet) because of the amount of noise generated in doing so; but full "debugging" is available: * use debug="10" in cluster.conf, or * -dddddddddd on the command line. 2008-06-27 Christine Caulfield [CONFIG] Improve LDAP error reporting and tidy up the include list too. [CONFIG] Make ldap put totem in the right place This is a bit of a hack for the moment, but it puts the totem and logging keys back into the right place in objdb. It also tidies the code a little, ready for later work. 2008-06-27 Benjamin Marzinski [gnbd-kernel] bz 442606: Switch gnbd to use deadline scheduler by default. GNBD was hanging under load with O_DIRECT. GNBD needs to block in its request function. This causes some problems with the anticipatory scheduler, which GNBD was using by default. To avoid these, this fix makes gnbd use the deadline scheduler by default. 2008-06-27 Fabio M. 
Di Nitto [BUILD] Add configure options for libldap required by config/plugins/ldap/ 2008-06-26 David Teigland fenced: revert logsys commits [FENCE] Start porting fenced to logsys cf4c7ebac813b0b607acf6cf74bbdddfc8cfb12a [FENCE] Make fenced ready to load logsys config c54c56c5a09f98547ceda3bc5fa9afa28b354480 [FENCE] Move logsys configuration calls where they belong 18e085596bb8844f74689a92662f2e5e9166836b [FENCE] Allow fenced to configure logsys da704715c606c9c01637ae53d79f8dec6a8b0389 [FENCE] fenced: separate concept of fork and debugging 95a5c6b13294742956b13070ebc4f4513278255f 2008-06-26 Dejan Muhamedagic Low: Xen RA: fix severity for one log message (thanks to Florian Haas) --HG-- extra : convert_revision : c1fc38bbbaa17063a977a65c80f4752b1479b530 2008-06-26 Lon Hohberger [rgmanager] Fix erroneous broadcast matching in ip.sh * This fixes an issue where rgmanager removes the wrong address because the IP matches the broadcast address of the interface. * Red Hat Bugzilla #453000 2008-06-26 Christine Caulfield [CONFIG] Add ldap configurator This is an openais configuration plugin to read the cluster config from an LDAP server. A schema file is included that provides just enough information to get a cluster running, more will follow. There is also an example ldif file to show how to load the information into the database. The defaults are slightly odd at the moment, I'll fix those as it develops and document how to override them. In the mean time see the top of the source code. 2008-06-26 Lon Hohberger [fence] fence_xvmd: Add KVM support; misc cleanups. * WARNING WARNING WARNING: Changes the default URI to KVM. You must specify uri="xen:///" in cluster.conf to use Xen now (or -U xen:/// on the command line) * fence_xvmd -h now displays appropriate cluster.conf related help information. * This commit simply fixes authorship of the previous patch; apparently git lets you commit as root even if a different user checked it out... 
Revert "[fence] fence_xvmd: Add KVM support; misc cleanups." This reverts commit beeb2070953548ecf38751294e5371a668f73ee2. 2008-06-26 root [fence] fence_xvmd: Add KVM support; misc cleanups. * WARNING WARNING WARNING: Changes the default URI to KVM. You must specify uri="xen:///" in cluster.conf to use Xen now (or -U xen:/// on the command line) * fence_xvmd -h now displays appropriate cluster.conf related help information. 2008-06-26 Serge Low: pgsql RA: check for the non-supported action --HG-- extra : convert_revision : 1cbc4c8c571f74ce85d0fbd04a6198d2dae8fb5d 2008-06-25 Fabio M. Di Nitto [CONFIG] Fix several bugs in XML parsing implementations [CONFIG] Add cluster.conf direct loader This plugin allows loading cluster.conf directly into the objdb for configuration. With this plugin there is no need to have ccsd running. To use: make sure that all nodes have the same cluster.conf, then form the cluster with: cman_tool -C xmlconfig join [CONFIG] Make sure to reset xml index if not in list mode [CCS] Remove duplicate header [FENCE] fence_tool: document "ls" [GFS2] hexedit does not need syslog [FENCE] fenced: update man page [CMAN] Remove unrequired includes [FENCE] fence_node: use logsys for logging to syslog [CCS] Use common syslog facility [FENCE] fenced: separate concept of fork and debugging Allow fenced to fork when debugging is set from the configuration or the system will hang at boot. [FENCE] Allow fenced to configure logsys [QDISK] Set debug from syslog_level only when requested Also make sure to set val to NULL after some operations. [CCS] Set debug from syslog_level only when requested [FENCE] Move logsys configuration calls where they belong [FENCE] Make fenced ready to load logsys config [FENCE] Start porting fenced to logsys 2008-06-24 David Teigland gfs_controld: basic fixes Fix leave/unmount; weren't calling the function. Fix match_change for messages from new nodes that don't contain info for all members.
2008-06-24 Bob Peterson gfs2_fsck fails: Unable to read in jindex inode. 2008-06-24 ejhernandez@warp.es Low: pingd RA: replace su(1) with sudo(8) --HG-- extra : convert_revision : 1e014af9b3a7bfd138721e11ded274153c857713 2008-06-23 David Teigland dlm_controld/gfs_controld: minor fixes dlm_controld shouldn't close libdlmcontrol connections that are meant to be persistent. gfs_controld shouldn't call start_kernel every time through process_mountgroup() after it's completed. 2008-06-23 Bob Peterson savemeta was not saving gfs1 journals properly. 2008-06-23 Fabio M. Di Nitto [MISC] Logging: optimizing query sequence Query for debug info only if nothing is specified on the command line. [QDISK] Port qdisk to the new logsys config interface NOTE: this commit also retains backward compatibility with the old logging config options but warns the users that they are deprecated. [CCS] Fix debug override from command line vs config [CCS] Always check for debug setting as first thing This allows us to enable and read debugging output as soon as possible. [BUILD] Fix new gfs_controld Makefile 2008-06-20 David Teigland gfs_controld: new version Uses libcpg directly instead of libgroup/groupd, like we've already done for fenced and dlm_controld. 2008-06-20 Lon Hohberger [rgmanager] Make rgmanager check pbond links correctly Rgmanager doesn't check pbond links (bonded links when Xen networking is used) correctly. This patch fixes it. Patch from John Ruemker. 2008-06-20 Fabio M. Di Nitto [QDISK] Major clean up Kill lots of dead and unused code around. Switch whatever possible into static functions. [QDISK] Init logsys later in the process [QDISK] Clean handling of debug envvar [QDISK] Remove duplicate debugging configuration [QDISK] get_config_data cleanup get_config_data does not need cluster name. Change invocation from ccs_force_connect to ccs_connect. ccs_force_connect returns only when connection is successful and can sit there forever.
With the new libccs, if we are connected to cman, we can be 100% sure that we will be able to do a ccs_connect. [QDISK] Make get_config_data static [QDISK] Fix debug type 2008-06-20 Christine Caulfield [CMAN] use list_iterate_safe when removing nodes 2008-06-20 Fabio M. Di Nitto [QDISK] Fix build with new openais logsys [CCS] Convert ccs logsys config to the ais format [CCS] Fix improper log level on debugging information [CCS] Improve logsys init order If we can init and config from a real config file, do that and start logging immediately. If we cannot, fall back to default and start logging. [CCS] Add cosmetic CCSENTER/EXIT for simple xml queries [CCS] Shrink more common code for internal xml queries [CCS] Init logsys as early as possible If we have an on-disk copy of cluster.conf we will use it as early as possible to configure logsys. If you are unlucky enough to have none, logsys will use built-in defaults and then switch to the configured settings once we get one from the network. [CCS] Remove LOG_MODE_DISPLAY_DEBUG from logsys settings [CCS] Remove duplicate code and make it common We are about to perform many queries to configure logging. Collect common code in one static function and switch set_ccs_logging to use it, by sharing the same XML context. [CCS] Fix a few logsys configuration bits Make sure to set debug = 1 when debugging is enabled via envvar. Delay logsys flush to catch a few more bits and allow us to configure properly. [CCS] Fix priority setting [CCS] Add missing CCSEXIT call 2008-06-20 Benjamin Marzinski [gnbd-kernel] bz 449812: disallow sending requests after a send has failed. This fix adds a "corrupt" flag to the gnbd device structure. This flag is cleared when a new socket connection is opened to the server. It is set whenever a send fails. After this, all future sends will fail, and the receiver process will stop accepting replies as soon as it notices the flag.
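The "corrupt" flag behaviour described in the bz 449812 entry can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the class, method, and exception names below (GnbdDeviceModel, connect, send, accept_reply, SendFailed) are hypothetical and written in Python for illustration; they are not the gnbd kernel code or its API.

```python
class SendFailed(Exception):
    """Raised when a request cannot be sent to the server (hypothetical name)."""


class GnbdDeviceModel:
    """Toy model of the sticky 'corrupt' flag; NOT the actual kernel structure."""

    def __init__(self):
        self.corrupt = False
        self.transport = None

    def connect(self, transport):
        # Opening a new socket connection clears the flag.
        self.transport = transport
        self.corrupt = False

    def send(self, request):
        # Once any send has failed, every later send fails immediately.
        if self.corrupt:
            raise SendFailed("device marked corrupt; reconnect first")
        try:
            self.transport(request)
        except OSError as exc:
            # A failed send sets the flag.
            self.corrupt = True
            raise SendFailed(str(exc)) from exc

    def accept_reply(self, reply):
        # The receiver stops accepting replies once it notices the flag.
        return not self.corrupt


def healthy_transport(request):
    """Stand-in for a working socket: sending always succeeds."""
    return None


def broken_transport(request):
    """Stand-in for a failed socket: sending always raises OSError."""
    raise OSError("connection reset")
```

In this model the only way to recover is to call connect() again, which mirrors the entry's statement that the flag is cleared when a new socket connection is opened.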
2008-06-19 Benjamin Marzinski gnbd-kernel: Fix receiver race It is possible to have the gnbd receiver process finish and end a request before the sending process has finished using the request structure. This can cause a kernel panic. This fix adds a waitqueue (tx_wait) and a pointer to the request currently being sent (current_request) to the gnbd device structure. current_request is set before any request is sent to the server. When the send is complete, it is cleared and the wait_queue is woken. A new function, wait_for_send(), is called whenever it is possible for a call to gnbd_end_request() to interleave with a send. It waits on the waitqueue if the request about to be ended is currently being sent. Conflicts: gnbd-kernel/src/gnbd.c 2008-06-18 Bob Peterson 452004: gfs: BUG: unable to handle kernel paging request. This is a gfs crosswrite from gfs2, to be included with 446085 in RHEL5. 2008-06-17 Lon Hohberger Ancillary NOCLUSTER mode fixes for fence_xvmd Ancillary NOCLUSTER mode fixes for fence_xvmd Fix #362351 - make fence_xvmd work in no-cluster mode Conflicts: fence/agents/xvm/fence_xvmd.c fence/agents/xvm/options.c 2008-06-16 James Parsons Fix for 251358 2008-06-13 Bob Peterson Fix 32-bit warning in super.c. 2008-06-13 Fabio M. Di Nitto [CCS] Fix build warnings on sparc 2008-06-12 Bob Peterson Fix gfs_fsck build warnings Fix gfs_tool build warnings Ignoring gets return value in gfs_mkfs Fix gfs_debug build warning Fix build warnings from libgfs 2008-06-12 Christine Caulfield [CMAN] Fix some compiler warnings on 64 bit systems 2008-06-12 Ron Terry RA SysInfo (Novell 399497): get the number of CPUs correctly --HG-- extra : convert_revision : b6de0d1458c049f98e29f2447766838a6c0f2748 2008-06-12 Bob Peterson Fix another compiler warning for 32-bit arch. 2008-06-12 Fabio M. Di Nitto [QDISK] Add better support for Xen virtual block devices This change allows detecting Xen virtual disks directly in sysfsattrs.disk without the need for an external filter.
[GFS2] Add missing include and fix build warning 2008-06-12 Bob Peterson Fix build warnings in gfs2-utils. 2008-06-11 Bob Peterson Added an optional block-size to mkfs.gfs2 2008-06-10 Fabio M. Di Nitto [MISC] Add another exception to COPYRIGHT [MISC] Remove old copyright [MISC] Add original author for cman/qdisk/disk.c [MISC] Remove osl-2.1 exception from README.licence [GFS] Remove obsoleted gfs_edit in favour of gfs2_edit 2008-06-09 Bob Peterson Fix compiler warning. Ability to specify starting block or structure with -s Allow keywords in block number input 2008-06-09 Fabio M. Di Nitto [MISC] Relicence rgmanager/src/resources/oracledb.sh under GPLv2+ There was no reason for this file to have a different licence. This kills the last exception in the tree (licence wise). [MISC] Whitespace cleanup [BUILD] Fix file permissions all around Let make/install.mk to take care of the install permissions. Don't mangle with tree permissions at all. [MISC] Update top level copyright file Add 2 missing exceptions and fix a typo [GNBD/FENCE] Move fence_gnbd agent where it belongs 2008-06-08 Fabio M. Di Nitto [BUILD] Prepare infrastructure for perl/python bindings 2008-06-06 Fabio M. Di Nitto [MISC] Tree cleanup Remove all dead code that has not been updated in ages or doesn't build and nobody knows what it is. This can be restored at a later stage from stable2 branch if required. [MISC] Cleanup licence, copyright and header duplication Add toplevel README.licence that explain what is what. Add copy of GPL-2 and LGPL-2.1 (as pointed by README.licence). Add toplevel COPYRIGHT that reports exact details of each file in the tree including authors. Cleanup all the headers around > 10K lines less of stuff to carry around and mind for cleanup. [MISC] Add top level licence files This is in preparation of a major file cleanup across the whole tree. [MISC] Remove obsolete and empty files 2008-06-06 Lars Marowsky-Bree RA: o2cb: Fix reference to o2cb init script (bnc#394666). 
Using o2cb is still not recommended! --HG-- extra : convert_revision : f4838a4d9cacdc706e451eb0bda7c2ba492b9ea2 2008-06-06 Fabio M. Di Nitto [BUILD] Collapse common library makefile bits in libs.mk Almost all libraries we build share the same Makefile bits: - Collapse them in one location. - Make it smarter to switch a library from shared to static and vice versa. - Convert all possible libs to use the new system. Fix libdlm linking order. Standardize invocation of AR. [BUILD] Switch libdlmcontrol back to shared library 2008-06-05 Marek 'marx' Grac Fixes #445662: names of resources with spaces are mishandled 2008-06-05 Lars Marowsky-Bree RA: Filesystem: Ensure that OCFS2 membership links are only removed on clean umount (bnc#389599) --HG-- extra : convert_revision : 56d709269fd129ca5b7698c37f8ce3b2779bbec8 2008-06-04 Bob Peterson Updates to gfs2_edit man page for new option. Make gfs2_edit more friendly to automated testing. Fix gfs2_edit bugs with non-4K block sizes 2008-06-04 Fabio M. Di Nitto [CMAN] Bump library version In preparation for 3.0 and to allow landing of perl/python bindings we need to set this to 3 to differentiate from stable2. 2008-06-04 Marek 'marx' Grac [FENCE] Fix: 447378: fence_apc unable to connect via ssh to APC 7900 Increased timeout from SHELL_TIMEOUT to LOGIN_TIMEOUT in login function. 2008-06-03 Fabio M. Di Nitto [CONFIG] Add full xpath support to libccs 2008-06-03 Dejan Muhamedagic RA pgsql: hint that notify/promote/demote are not implemented --HG-- extra : convert_revision : fc2c93c98591bf5c37086c505f66c57ed332ea39 2008-06-03 Fabio M. Di Nitto [MISC] Make several APIs private again A bunch of APIs have been exported and made public by mistake. libdlmcontrol, libfenced and libgfscontrol are now private again and no shared libraries are available. Make sure to uninstall the shared libraries from your system as they are not used any longer.
NOTE: make sure to clean your tree before you git pull and then re-configure (optional to get rid of old build variables). 2008-06-02 Mark Hlawatschek mount.gfs2: skip mtab updates Skip updates to /etc/mtab when it's a link to /proc/mounts (which is the case for shared root gfs, for example.) bz 318271 (RHEL5) 2008-06-02 Marek 'marx' Grac [FENCE] Fix #446995: Unknown option Previous patch worked just for command line and there was a problem with stdin argument. Typo fixed. 2008-06-02 Dejan Muhamedagic Low: remove haresources2cib (it's in pacemaker) --HG-- extra : convert_revision : 34d2cf58259f9b766e3d9e2356c7310c907b56fa 2008-05-30 Dejan Muhamedagic High: remove stonithd (which moved to pacemaker) --HG-- extra : convert_revision : 753ba3d6d2cd9f58864dce2f7029883d5ca4e6ae 2008-05-29 Fabio M. Di Nitto [BUILD] Fix mount.gfs2 build Remove unrequired SHAREDOBJS since there is only one target. Fix CFLAGS to use gfscontrolincdir. Statically link mount.gfs2 with libgfscontrol. While this is absolutely ugly, it is the best way to prevent a few tons of other problems. [BUILD] gfs2 requires group to build [BUILD] Change build system to cope with new libgfscontrol NOTE: you will need to rerun configure to set the new variables [GROUP] libgfscontrol: fix build with gcc-4.3 [GFS] remove symlink to umount.gfs2 umount.gfs2 is gone with commit d33f4f1df3e8f84603418d0192c1af18794d3136 2008-05-28 David Teigland gfs_controld: restructuring - copying the code structure/organization of dlm_controld - isolate the cluster2 code from what will be the cluster3 code - add libgfscontrol and gfs_control - use libgfscontrol between gfs_controld and mount.gfs - eliminate umount.gfs, no longer used gfs_controld: move recover.c Move recover.c into cpg-old.c in preparation for new version. gfs_controld: rename files Renaming files in preparation for new version. 2008-05-28 Ryan McCabe libfence: update copyright notice - Clarify copyright notice so that GPLv2 is specified explicitly. 
libfence: handle EINTR correctly - Handle EINTR correctly - String cleanups fence: fixes and cleanups to fencing.py library - Do not report failure if a node is successfully powered off, but fails to power on. - Change 'pass' to 'continue' so that comments and blank lines from stdin are ignored. - Use the full path to ssh and telnet when executing the binaries. 2008-05-26 Fabio M. Di Nitto [BUILD] Add fence_lpar fencing agent to the build system 2008-05-23 Marek 'marx' Grac [FENCE]: Fix #237266: New fence agent for HMC/LPAR 2008-05-23 Fabio M. Di Nitto [CONFIG] Fix lots of bugs in libccsconfdb Add tokenizer to split xpath queries into tokens. Make some functions static since they are not exported. Simplify path_dive and get_data to use tokens. Clean up comments. Implement a lot more error checking and be extremely picky on how queries are requested. Fix some string handling to avoid data corruption in result generators. 2008-05-22 Fabio M. Di Nitto [CCS] Use absolute path for queries * It is more efficient * It is a well-defined location 2008-05-22 Lon Hohberger [rgmanager] Use /cluster/rm instead of //rm * It is more efficient * It is a well-defined location 2008-05-22 Fabio M. Di Nitto [BUILD] Plug in the new shiny fence_ifmib agent [FENCE] Fix ifmib README to report the right fence agent [FENCE] Fix copyright header for fence_ifmib manpage 2008-05-22 Ross Vandegrift [FENCE] Add fence_ifmib new agent Many thanks to Ross Vandegrift for this submission. 2008-05-21 Marek 'marx' Grac [FENCE] Fix #447378 - fence_apc unable to connect via ssh to APC 7900 The problem was that even with ssh itself it was really painful to log into this device (15 - 40 seconds). After specifying cipher and protocol we can log in and check status in a time comparable to others. 2008-05-21 Fabio M. Di Nitto [GFS2] Use proper include dir for libvolume_id [BUILD] Fix install permissions Virtually every distro was complaining about script permissions. Clearly we were doing it wrong. Fix them all.
[BUILD] Fix rg_test linking rg_test does not need libccs [BUILD] Fix dlm_controld linking dlm_controld does not require linking with libdlmcontrol. Merge branch 'master' of ssh://sources.redhat.com/git/cluster [BUILD] Fix build order group needs cman and not just config to build 2008-05-20 Michael Schwartzkopff RA IPaddr2: fix IP_CIP_HASH default --HG-- extra : convert_revision : 9ea9d3cbe2063467ec9d3f80f541a1a4b691164d 2008-05-20 Christine Caulfield [CMAN] Don't busy-loop if we can't get a node name And remove a spurious error message. 2008-05-20 Bob Peterson bz 446085: Back-port faster bitfit algorithm from gfs2 for better performance. 2008-05-20 Lon Hohberger [rgmanager] Fix live migration option (broken in last commit) [rgmanager] Apply patch from Marcelo Azevedo to make migration more robust * Adds a mapping of cluster nodes to private hostnames for migration paths * Makes migration status reporting more robust 2008-05-20 Fabio M. Di Nitto [MISC] Update copyright 2008-05-20 Marek 'marx' Grac [FENCE] Fix #446995: Parse error: Unknown option 'switch=3' Support for APC MasterSwitch was added (it worked in original fencing agent). Option 'switch' on STDIN didn't have a getopt alternative, so '-s ' was added. Plug number notation : works as before. Original behaviour for missing switch number does not change. If there is just one MasterSwitch then we will set it up otherwise error is returned. 2008-05-20 Fabio M. Di Nitto [BUILD] Fix sparc #ifdef according to the new gcc tables 2008-05-19 Marek 'marx' Grac [FENCE] Fix #248609: SSH support in Bladecenter fencing (ssh) Complete ssh support for Bladecenter. You can use password or private key (identity_file on STDIN; -k in getopt) to login to system. This patch contains complete infrastructure (usable also by other agents). 2008-05-19 Fabio M. Di Nitto [GFS] Sync with gfs2 init script [INIT] Do not start services automatically Apply Fedora policy to not start services automatically to our init scripts. 
[GFS] Fix comment 2008-05-19 Bob Peterson Replace put_inode with drop_inode 2008-05-16 Fabio M. Di Nitto [BUILD] Stop using DEVEL.DATE library soname Start using the official sonames also for development trees. Release version is there to catch the same information. [CCS] Make a bunch of functions static [CONFIG] Add missing Makefiles [CONFIG] Create config/ subsystem Move libccsconfdb to config/libs/ Move ccs_test to config/tools/ top level Makefile: - use config/ subsystem - reorder build dependencies - ccs is now on its own configure: - change default location for libccs Update Makefiles to cope with new locations [BUILD] Free toplevel config/ dir [BUILD] Add --without_kernel_modules configure option Allow users to disable build of kernel modules. 2008-05-15 Jonathan Brassow rgmanager/lvm.sh: HA LVM wasn't working on IA64 switch: $(find /boot/*.img -newer /etc/lvm/lvm.conf) to: $(find /boot -name *.img -newer /etc/lvm/lvm.conf) Could be better still if there was a way of knowing which initrd we used when booting... 2008-05-14 Fabio M. Di Nitto [DLM] Remove unused header file 2008-05-14 Marek 'marx' Grac [FENCE] Fix typo in name of the exceptions in fencing agents Exceptions should be pexpect.EOF, pexpect.TIMEOUT (not pexcept.*). This problem only occurred in set_status(). Function get_status() contains correct exceptions. 2008-05-14 Fabio M. Di Nitto [FENCE] Rename bladecenter as it should be .pl -> .py 2008-05-14 Marek 'marx' Grac [FENCE] Fix problem with different menu for admin/user for APC In APC user/admin can see a different menu and they have to use a different sequence of keystrokes to access Outlet Controls. Previously only support for user was provided. [FENCE] Fix name of the option in fencing library We were testing for option 'plug_no' but in every other file we have 'port'. 2008-05-14 Fabio M. Di Nitto [CMAN] Fix path to cman_tool 2008-05-13 Fabio M.
Di Nitto [MISC] Cast some love to init scripts - Fix LSB headers - Use standard templates - Align runlevels - Add service dependencies - Fix subsystem usage [BUILD] Move fencelib to /usr/share Python byte-compiled objects are arch independent. Move the whole thing where it belongs. 2008-05-13 David Teigland dlm_controld: use started_count to detect remerges Count the number of times the dlm has been started locally and include the count in the start messages sent to begin each change. I think this should work to detect remerges after transient partitions. For now it just logs a warning; after observing how it works we can use it to ignore remerged nodes. dlm_controld/gfs_controld: ignore write(2) return value on plock dev bz 446128 When plocks originate from nfs clients, the kernel mistakenly returns 0 instead of the number of bytes written to the plock device on write(2). Don't spam /var/log/messages with errors reporting a bad return value from write(2). 2008-05-13 Dominik Klein Medium: Fix return codes if installation directory/user/group/config is not found. --HG-- extra : convert_revision : 42ce605e3da516db5e0a69b92d6e27433537ab53 2008-05-13 Fabio M. Di Nitto [RGMANAGER] ^M's are good for DOS, bad for UNIX aka: don't edit shell scripts with notepad. thanks. [GFS] Make gfs build with 2.6.26 (DO NOT USE!) put_inode has been removed from the main kernel. gfs1 needs a full porting and review. This commit only allows the module to build, but it will NOT work. [GNBD] Update gnbd to work with 2.6.26 [BUILD] Require 2.6.26 kernel to build 2008-05-12 David Teigland dlm_controld: remove unworking re-merge detection Comment out the code that's supposed to detect when nodes are being remerged after a transient partition; it's not smart enough yet. dlm_tool: refine list output dlm_controld: dlm_tool query fixes Various fixes to make queries, e.g. dlm_tool ls, show the right info.
2008-05-09 David Teigland dlm_controld: options to disable fencing/quorum dependency There may be cases where someone wants to use the dlm without a recovery dependency on fencing and/or quorum. dlm_controld: fix waiting for removed node When a node is removed by leaving (not failing), don't wait for it to be removed by cman/quorum, since it won't be. 2008-05-09 Christine Caulfield [CMAN] fix cman_tool join -X Even when started with cman_tool -X, the preconfig stage tried to look up the node in the objdb (which is empty!) and failed if it was not found. The "aisexec" object did not exist so the username was not stored. 2008-05-09 Lon Hohberger [rgmanager] Fix #441582 - symlinks in mount points causing failures 2008-05-08 David Teigland daemons: queries Completing list command for fence_tool and dlm_tool. 2008-05-08 Marc - A. Dahlhaus [MISC] Add version string to -V options of dlm_tool and group daemons 2008-05-07 David Teigland daemons: mostly daemonization stuff - do lockfile before daemonizing - use daemon(3) - openlog even if not forking - other little odds and ends 2008-05-07 Dominik Klein Low: RA mysql: fix metadata --HG-- extra : convert_revision : 49b142475fa9925bb440d359816aacc1fe6c4495 2008-05-07 Fabio M. Di Nitto [MISC] Fix some gfs2 build warnings [BUILD] Fix install when building from a separate tree 2008-05-07 Christine Caulfield [CMAN] make qdisk compile on i386 The combination of -Werror and an attempt to print a size_t on several architectures was causing it not to build on i386. The cast isn't ideal, but it's only a log message. 2008-05-07 Fabio M. Di Nitto [MISC] Fix even more build errors with Fedora default build options ... and more to come [MISC] Fix more build errors with Fedora default build options NOTE: some bits are absolutely not required but we still add them to shut up the warnings.
[MISC] Fix build errors with Fedora default build options [BUILD] Allow users to set path to init.d Almost all distributions use /etc/init.d but some still use the legacy path to /etc/rc.d/init.d. Allow builders to set the path instead of using some manual workarounds to do later fixup. 2008-05-06 David Teigland dlm_tool: add libdlmcontrol query commands Basic code for ls, dump, plocks, not yet tested. dlm_controld: code for info/debug queries 2008-05-06 Marek 'marx' Grac [FENCE] Fix #435154: Support for 24 port APC fencing device Fixed in the new version of python fencing agent. But there was still a problem because there is major change of interface between firmware v2.7.x and v3.5.x. After this patch both types of firmware version should work. 2008-05-06 Christine Caulfield [CMAN] Fix localhost checking that I broke last week. 2008-05-05 Fabio M. Di Nitto [CMAN] Set default syslog facility at build time [CCS] Fix build with gcc-4.3 2008-05-02 Fabio M. Di Nitto [CCS] Detach dependency on ccsd to run the cluster The old static libccs is now called libccscompat and should not be used by anything outside ccs/. Note that the headers and the library are not installed, nor configure knows about it. All application paths that require compat are made static and non-configurable. Allow ccsais and ccs_tool to use libccscompat. Stop linking ccsd with libccs. It was not required before either. Add new shared library called libccs that uses aisexec confdb directly. The new libccs retains API compatibility with the old library so it was possible to "unplug" the old and "plug" the new one in one go. All the applications in the stack are now quering aisexec db directly. NOTES: - the new library does not emulate all the libccscompat calls as some of them are not used anywhere but this is not a problem since libccs was static. We are shipping it now shared and we a new soname. - the library implements only a small subset of xpath to query the aisexec config db. 
Port ccs_test to use the new libccs and drop all calls that are not used anywhere. Set new default ccsincdir and ccslibdir to ccs/libccsconfdb in configure. NOTE to developers: you will need to rerun configure. [CMAN] Do not query ccs as it might not be the right config plugin 2008-05-01 David Teigland dlm_controld: filling out code Filling in various unfinished bits, fixing up wrong/inconsistent header length (also in fenced). 2008-05-01 Christine Caulfield [CMAN] Remove external dependencies from config modules This removes the external dependencies from the aisexec config modules because they break all sorts of things. Instead of logging errors, they return them back to the caller - this removes the dependency on logsys. Also, I've removed the totemip function calls and replaced them with usable subsets in the config code itself. The cman and ccs config plugins now work with testconfdb in standalone mode. 2008-04-30 David Teigland libdlmcontrol: filling out code 2008-04-30 Fabio M. Di Nitto libdlm: fix libdlmcontrol in Makefile 2008-04-30 David Teigland dlm_controld: fix build problems in previous commit 2008-04-30 Marek 'marx' Grac [FENCE] SSH support using stdin options STDIN options have to be name=value even if they are just boolean. These options are taken from cluster.conf so they have to be XML-like. 2008-04-30 Fabio M. Di Nitto [BUILD] Change build system to cope with new libdlmcontrol 2008-04-29 David Teigland libdlmcontrol: new lib interface to dlm_controld Adds all the structure, most of the calls do nothing yet. libdlm: use linux/dlm.h from 2.6.26-rc Use new linux/dlm.h (which includes dlmconstants.h) to give us DLM_LOCKSPACE_LEN. fence_tool: fix list command 2008-04-28 David Teigland fence: fence_tool list and fenced_domain_nodes() Let fence_tool query fenced for domain state. Change fenced_domain_members() to fenced_domain_nodes() to query for nodes other than members.
2008-04-28 Lon Hohberger [cman] Close sockets in error state in gfs_controld / dlmtest2 / groupd test 2008-04-28 Marek 'marx' Grac [RGMANAGER] Fixed typo in mysql.metadata Changed httpd to mysqld 2008-04-28 Fabio M. Di Nitto [CMAN] Setup logging file Align cman logging options with the other subsystems and set logging file to LOGDIR "/cman.log". It is still possible to disable the feature by using the standard to_file: no config option (default to yes). 2008-04-26 Fabio M. Di Nitto [BUILD] Fix build order. Gotta love circular build depends... 2008-04-25 David Teigland fenced: allow queries during fencing; group queries Put mutex unlock/lock around fencing steps that take a while so that the query thread won't be blocked. Fill in query info for libgroup mode. fenced: process queries in a thread Add a thread for responding to the new libfenced queries, since fenced blocks for long periods when fencing nodes. fence: using new libs Filling out various incomplete parts, making use of the new interfaces. 2008-04-25 Fabio M. Di Nitto [RGMANAGER] Fix uninstall target [FENCE] Enable new fence agents by default Remove enable_fence_experimental_agents configure option in favour of "crack_of_the_day" Rename files around as agreed with Marek. [GROUP] Apply patch to make gfs_controld work with 2.6.26 [BUILD] Fix fence lib install target [BUILD] Fix kernel check for good Use the top level Makefile of the source tree rather than random includes that keeps changing. [BUILD] Fix building with separate object dir [BUILD] Deal with new libfenced configure, make/defines.mk.input: - add fencedincdir and fencedlibdir options for the shared library. - set default for fenceincdir and fencelibdir to ./fence/libfence (renamed). fence/Makefile: - build the shared lib. fence/fence_node/Makefile, fence/fence_tool/Makefile: - use fencedincdir and fencedlibdir. - add build-dep on the shared library. fence/fenced/Makefile: - update depends on libfence (renamed). 
fence/libfenced/Makefile: - use path to source instead of direct include to fenced. fence/lib: - renamed to fence/libfence. 2008-04-24 David Teigland fenced: new libfenced interface A new library, libfenced, is used to communicate with fenced. Previously, programs would each open fenced's local socket and write/read strings. dlm_controld: build plock code Building this requires 2.6.26-rc kernels because of linux/dlm_plock.h. Once built, dlm_controld is backward compatible, and will process plocks from earlier kernel versions. 2008-04-24 Fabio M. Di Nitto [CCS] Allow ccsd logging level and facility to be set by cluster.conf This change allow to set log_level and log_facility for the ccs subsystem within cluster.conf. Here is a config example: .... [CCS] Document -d (debugging) switch [CMAN] Use build/user defined default logging facility [CMAN] Convert qdiskd to use logsys 2008-04-23 David Teigland fenced: more new devel New stuff still under development, lots of various things fixed, reworked, changed. 2008-04-23 Fabio M. Di Nitto [CCS] Switch to use user selected logdir and syslogfacility [BUILD] Allow users to set default log dir and syslog facility [BUILD] Fix install/uninstall targets for fence/agents/lib [BUILD] Deal with the new libfence properly configure, make/defines.mk.input: - rename fencelibdir to fenceagentslibdir to avoid name space collision. - add fenceincdir and fencelibdir options for the shared library. make/fencebuild.mk, make/install.mk, make/uninstall.mk, fence/agents/apc/apc.py, fence/agents/bladecenter/bladecenter.py, fence/agents/drac/drac5.py, fence/agents/ilo/ilo.py, fence/agents/lib/Makefile, fence/agents/wti/wti.py: - rename fencelibdir to fenceagentslibdir. fence/Makefile: - build the shared lib. fence/fence_node/Makefile, fence/fenced/Makefile: - use fenceincdir and fencelibdir. - add build-dep on the shared library. fence/fence_tool/Makefile: - remove obsolete depends target. 
fence/fenced/fd.h: - include "libfence.h" to fix implicit declaration when building fenced. fence/lib/Makefile: - build also static version of libfence. - move ldflags at the end of the linking invokation call. 2008-04-22 David Teigland fenced: new version In the same theme as the new version of dlm_controld. - uses libcpg directly without libgroup (use the -g0 option) - runs in backward compat mode by default, using libgroup to interact with old groupd/fenced (-g1 option) - move code that runs agents (agent.c) into libfence 2008-04-22 Andrew Beekhof Low: RA: Stateful - Set master preference in stop/start/promote/demote as a real RA would --HG-- extra : convert_revision : 1cfe8f4589bcc7c76f2bb8868272b07bfabafe3e 2008-04-22 Jonathan Brassow rgmanager/lvm.metadata: Fix parameter description fields Descriptions were never properly updated after creation. 2008-04-22 Fabio M. Di Nitto [MISC] Update Red Hat main copyright file [MISC] Update copyright headers 2008-04-22 Andrew Beekhof Medium: RA: Stateful - Allow instances to be promoted by setting a master preference --HG-- extra : convert_revision : 0daebc274f471273f591b52d9d4f69db7f18f74d Medium: RA: drbd - Ensure the master preference is set/deleted consistently --HG-- extra : convert_revision : 939d04c33a42e67eb342585ddbd5ed6d0bd6c302 2008-04-22 Fabio M. Di Nitto [CCS] Convert to logsys [CCS] libraries should never log 2008-04-22 Benjamin Marzinski The gnbd kernel module on 64 bit architectures didn't handle ioctls from 32 bit userspace processes. Now it does. Resolves: bz #440454 Also, I was getting an error because on ppc64, the manual definition of O_DIRECT in gnbd/server/device.c was incorrect. I switched to defining _GNU_SOURCE, which should mean that O_DIRECT will be automatically defined. 2008-04-22 Lon Hohberger Remove clushutdown man page references from clusvcadm.8; resolves #324151 2008-04-22 Fabio M. Di Nitto [rgmanager] Remove obsolete clushutdown utility Merge from RHEL5 branch. 
2008-04-21 Andrew Price [GFS2] gfs2_edit: Remove duplicate linux_endian.h gfs2/edit/linux_endian.h is an exact duplicate of gfs2/include/linux_endian.h and can be removed as gfs2/include/linux_endian.h is picked up instead. 2008-04-21 Christine Caulfield [CMAN] Disallow a new dirty node from joining the cman cluster Patch from David Robinson, bz#443358 2008-04-18 Bob Peterson bz295301: Need man page for gfs_edit 2008-04-18 Lon Hohberger [fence] Close file descriptors that are in invalid/error states If poll was returning with a file descriptor noted as active, but with the POLLERR/POLLNVAL flags set (but not POLLIN or POLLHUP), fenced and groupd would enter a tight spin loop. This fixes that condition. 2008-04-17 Andrew Beekhof Low: Hg: Merge back changes from the 'test' repo for 2.1.3 --HG-- extra : convert_revision : 6de5bc38e86076d2d2148a7415c168f8f5cc8ea4 2008-04-16 Abhijith Das gfs2_tool: Fix build warnings in misc.c bz 441636 gfs2_tool used to include both linux/fs.h and sys/mount.h that caused some symbols to be defined twice and hence caused some build warnings. This patch removes #include linux/fs.h and moves all the required definitions from there to a new local header file iflags.h. This patch also removes the SYSTEM and DIRECTIO flags as they are not used anymore. 2008-04-16 Fabio M. Di Nitto [GFS2] Fix build warning 2008-04-16 Andrew Price [GFS2] Remove unrequired header file "grep -nr 'list\.h' gfs2/" found that this file was not included anywhere and removing it does not stop any of the gfs2 utils from building so it seems that it can be removed. 2008-04-16 David Teigland gfs_controld: retry recovery for withdrawn journal bz 442451 This is unfortunate, but seems to be the best solution available. The problem, described more fully in the bz, is that when gfs_controld tries to do recovery on a journal for a withdraw, the withdrawing node may not yet have cleared its dlm locks. 
This means the journal lock may still be held by the withdrawing node, causing all the recovering node(s) to fail acquiring it, and no one does the recovery. The solution is for all recovering nodes to retry recovery of a withdrawn journal until they succeed (only the first to get the journal lock will actually recover it, the others will see it's recovered and report success.) 2008-04-16 Lon Hohberger [fence] Preliminary TPS/NBB/NPS support in new WTI agent. 2008-04-16 Fabio M. Di Nitto [RGMANAGER] Fix build with gcc4.3 [BUILD] Fix clean target Make sure to remove also Module.markers files created by Fedora 9 kernel build system. 2008-04-16 Christine Caulfield [FENCE] Make it build with gcc 4.3 fence_tool.c also needs [MISC] Make it build with gcc 4.3 A few files now need to include and cman was using an illegal access into the sockaddr_in6 structure. 2008-04-16 Fabio M. Di Nitto Revert "gfs2_tool: Fix build warnings in misc.c bz 441636" This reverts commit d2a926d2122c23e6175a62326b5e2b421b842a93. 2008-04-16 jparsons Bump MAX_DEVICES in fenced from 4 to 8 Addresses request in rhbz#284701 2008-04-16 David Teigland gfs: don't cancel glocks when writing to hidden file bz 438268 When glock.c sees the PRIORITY flag, it cancels any outstanding glocks prior to doing the lock request. lock_dlm also uses the PRIORITY flag to give granting priority to the lock in the dlm. Both of these are necessary when the PRIORITY glock is used for recovery, but only the second is wanted (neither is really needed) when writing to a hidden file. A new GL_NOCANCEL_OTHER flag, combined with PRIORITY, is used to tell glock.c to not do the cancels. 2008-04-15 David Teigland dlm_controld: max name length sanity Define MAX_LS_NAME 64, and note that it should match MAX_LOCKSPACE_LEN in dlmconstants.h. Including linux/dlm.h directly is difficult because some files need to include libdlm.h which doesn't combine nicely with linux/dlm.h.
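A minimal sketch of the kind of name-length check that the "max name length sanity" entry above describes. Only the name MAX_LS_NAME and its value 64 come from the entry; the helper check_ls_name(), its errno return values, and the choice to treat 64 as an inclusive limit are assumptions made for illustration, not code from dlm_controld or libdlm.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Value taken from the entry above; it is meant to track the kernel limit. */
#define MAX_LS_NAME 64

/*
 * Hypothetical helper (not from dlm_controld): validate a lockspace name
 * before it is handed to the kernel.
 * Returns 0 if the name is usable, or a negative errno value otherwise.
 * Assumption: a name of exactly MAX_LS_NAME bytes is still accepted.
 */
int check_ls_name(const char *name)
{
    size_t len;

    if (name == NULL)
        return -EINVAL;

    len = strlen(name);
    if (len == 0)
        return -EINVAL;        /* empty names are rejected */
    if (len > MAX_LS_NAME)
        return -ENAMETOOLONG;  /* longer than the shared limit */

    return 0;
}
```

A caller would run check_ls_name() on user input first and only create or join the lockspace when it returns 0, so that userspace and the kernel agree on a single limit.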
libdlm: max name length sanity Attempting to bring some sanity to handling of max lockspace name length and max resource name length. A new kernel patch creates single authoritative definitions in linux/dlmconstants.h: define DLM_LOCKSPACE_LEN 64 define DLM_RESNAME_MAXLEN 64 These definitions are copied in libdlm.h so that user apps don't need to include the kernel header. libdlm itself uses the dlmconstants.h definitions, and now checks both resource and lockspace name params against these definitions before calling into dlm-kernel. dlm-kernel checks each of the input names against these definitions when creating a new rsb or ls. 2008-04-15 Bob Peterson bz438762: gfs_tool: Cannot allocate memory 2008-04-15 Andrew Price [GFS2] gfs2_fsck: Fix operation on 'ptr' may be undefined warnings Occurrences of *ptr++ in fs_recovery.c made gcc throw up "operation on 'ptr' may be undefined" warnings. This patch disambiguates those occurrences. 2008-04-15 Fabio M. Di Nitto [GROUP] Fix building with standard kernels > From: David Teigland > Subject: [Cluster-devel] kernel for building master > This commit assumes dlm kernel changes that are only available in > linux-next or linux-mm (linux/dlm_plock.h). This goes against our aim to > keep master building against -rc kernels by default, so the following > patch disables the relevant part for now. Fabio has said he may turn this > ifdef into something more sophisticated. Apply patch from David to allow dlm_controld to build again against vanilla kernels. Change the patch to use the EXPERIMENTAL_BUILD infrastructure. Cleanup a couple of typos from the patch. 
[BUILD] Set automatically cflags when building experimental bits [BUILD] Fix typo [BUILD] Add --enable_crack_of_the_day configure option This option should NEVER be used lightly and it's there as facility for developers that need/want to commit experimental code that, for one reason or another, is not ready for general use or it depends on other code that is not available mainline yet. 2008-04-14 Bob Peterson bz425421: gfs mount attempt hangs if no more journals available 2008-04-14 Fabio M. Di Nitto [GFS2] Fix build warning 2008-04-14 Christine Caulfield [CMAN] Save the new expected_votes when a node is removed When a node leaves the cluster using 'cman_tool leave remove' it reduces the quorum of the cluster to accomodate the loss of the node. Unfortunately the following transition messages raise quorum back to its original value again because of the bug fix for bz#308581, so it appears that the remove hasn't worked. 2008-04-13 Bob Peterson bz440896/440897 GFS: gfs_fsck should repair gfs_grow corruption (see bug #436383) 2008-04-11 David Teigland dlm_controld: quorum checking Fill out the quorum dependency checking, and refine structure of the fencing and fs dependency checking which don't actually work yet. dlm_controld: new version - uses libcpg directly without libgroup (use the -g0 option) - takes over plock handling from gfs_controld - interacts with fenced and fs_controld to coordinate recovery (todo) - runs in backward compat mode by default, using libgroup to interact with old groupd/dlm_controld (-g1 option) - plan to add a new default -g2 option that will detect old groupd's in the cluster and only run in old mode if any exist 2008-04-11 Abhijith Das gfs2_tool manpage: Updates to the manpage for bz441636 gfs2_tool: Fix build warnings in misc.c bz 441636 gfs2_tool used to include both linux/fs.h and sys/mount.h that caused some symbols to be defined twice and hence caused some build warnings. 
This patch uses linux/ext3_fs.h instead of /linux/fs.h and uses EXT3_XXX_FL inode flags instead of the respective FS_XXX_FL flags. This patch also removes the SYSTEM and DIRECTIO flags as they are not used anymore. 2008-04-11 Bob Peterson Fix some compiler warnings in gfs2_edit gfs2_edit was not recalculating the max block size after it figured that out. Fix gfs2_edit print options (-p) to work properly for gfs-1 rgs and rindex. Also fixed rgflags option for gfs1. Fix savemeta so it saves gfs-1 rg information properly Also add savergs option to facilitate rg-only repairs. 2008-04-11 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : c2b66f3f7ec0c00bc3852fc8d567393bf7936f21 Low: Build: The AC_SUBST is only used for @BLAH@ expansion, not the setting of #defines in config.h --HG-- extra : convert_revision : eba5a49dc54f1692cf56ad264ded170fc002a085 2008-04-09 Fabio M. Di Nitto [KERNEL] Update modules to build with 2.6.25 Update clean target to cope with a new file that Kbuild creates at build time. Bump minimum kernel requirements to 2.6.25. Port modules to new kobj api. 2008-04-09 Christine Caulfield Remove references to broadcast. Remove references to CCSD. Fix entry for nodeid which seems to have got split up! 2008-04-09 Ryan O'Hara BZ 441323 : Redirect stderr to /dev/null when getting list of devices. 2008-04-09 Chris Feist Added back in change to description line to make chkconfig work properly. 2008-04-09 Ryan O'Hara BZ: 373491, 373511, 373531, 373541, 373571, 429033 - Prevent "reservation conflict" messages when scsi_reserve starts. - Leave the fence domain if scsi_reserve fails to register with any device. - Improve logging in scsi_reserve script. - Use "locking_type = 0" for all lvm commands (ie. vgs). - Fix SCSI reservations scripts to handle LVM mirrors and stripes. - Not an error if fence_scsi attempts to remove a non-existent key from a device.
BZ 248715 - Use cluster ID and node ID for key rather than IP address. 2008-04-09 Ryan McCabe fix bz277781 by accepting "nodename" as a synonym for "node" 2008-04-09 Ryan O'Hara Fix help message to refer to script as 'fence_scsi_test'. Attempt to register the node in the case where it must perform fence_scsi fencing but is not registered with the device(s) that must be fenced. With SCSI persistent reservations, in order to do a "preempt and abort" (which we are using to fence a node), the node doing this operation must be registered with the device. This fix will check to see that the node that is performing the fencing is registered with the device(s). If it is not, then it attempts to register with the device(s) so that it can then continue with the fence operation normally. Note that this situation should never happen, especially if things are configured properly. Allow 'stop' to release the reservation if and only if there are no other keys registered with a given device. Prior to this fix, it was not possible for 'scsi_reserve stop' to unregister/release on the node that was holding the reservation. Record devices that are successfully registered to /var/run/scsi_reserve. Rewrite of get_scsi_devices function. It is no longer possible to use lvs to get a list of cluster volumes (and underlying devices) at fence time. For this reason we must "keep state" by recording which devices we register with at startup. The init script (scsi_reserve) will record each device it successfully registered with to a file (/var/run/scsi_reserve). Then, at fence time, the fence_scsi agent will unregister each device listed in the state file. Fix success/failure reporting when registering devices at startup. If our node (key) is already registered with a given device, do not report failure since this is misleading. Replace /var/lock/subsys/${0##*/} with /var/lock/subsys/scsi_reserve. Fix split calls to be consistent. Remove the optional LIMIT parameter.
Fix code to use get_key subroutine. Fix sg_persist commands to specify device via -d parameter. Remove "self" parameter. This was used to specify the name of the node performing the fence operation, and was passed to the agent. This is no longer used. Instead, we get the name of the local node in the agent by parsing the output from 'cman_tool status'. Fix unregister code to report failure correctly. Variable should be quoted in conditional statement. 2008-04-09 Fabio M. Di Nitto Revert "fix bz277781 by accepting "nodename" as a synonym for "node"" This reverts commit 57a07697afeb2e5d3bb2a4220e844bcdc44598cb. Revert "Fix help message to refer to script as 'fence_scsi_test'." This reverts commit e9b17a088668ceaba5c6ea35f8d22ec2613cfe96. 2008-04-08 Andrew Price [BUILD] Warn and continue if CONFIG_KERNELVERSION is not found Currently the configure script assumes that CONFIG_KERNELVERSION is defined in autoconf.h. This patch handles the case where it isn't defined there. 2008-04-08 Fabio M. Di Nitto [BUILD] Fix clean target for experimental fence/agents/lib 2008-04-07 Abhijith Das gfs2_tool manpage: gfs2_tool counters doesn't exist anymore. This patch reflects the removal of the 'counters' command from gfs2_tool. bz 438759 gfs-kernel: fix for bz 429343 gfs_glock_is_locked_by_me assertion This assertion shows up when gfs_readpage gets called without the inode glock being held through the madvise syscall when the kernel attempts to readahead. This patch unlocks the page, locks the inode glock and returns AOP_TRUNCATED_PAGE. I had to change gfs_glock_is_locked_by_me() to return the holder if glock is held or NULL otherwise (instead of the TRUE/FALSE integer value it used to return earlier). I also added a new GL_READPAGE flag. If we need to get an inode glock in gfs_readpage(), this flag is set on the holder. We must not unlock another holder that we might have had on the glock before we entered gfs_readpage; checking for this flag before unlocking ensures that.
2008-04-05 Abhijith Das gfs2_tool: remove 'gfs2_tool counters' as they aren't implemented anymore gfs2 doesn't implement counters anymore so we remove them. BZ 438759 came up because gfs2 stopped implementing counters. Conflicts: gfs2/tool/Makefile 2008-04-03 Jonathan Brassow rgmanager/lvm.sh: Minor comment updates Just moving/expanding some comments. 2008-04-03 HIDEO YAMAUCHI RA oracle: fix a typo --HG-- extra : convert_revision : c03bb0093041c6abaa200d056b8d6be145ff3419 2008-04-02 Bob Peterson Resolves: bz 436383: GFS filesystem size inconsistent 2008-04-02 Jonathan Brassow rgmanager/lvm.sh: change argument order of shell command Nice to have output redirect at end of line... cosmetic change only. 2008-04-01 Jonathan Brassow rgmanager/lvm.sh: Fix bug bz242798 Allow a machine to fence itself in the event that it cannot deactivate logical volumes. (The user must explicitly enable this option.) This is useful in cases where one machine in the cluster loses connectivity to its resources, but the others don't. The machine fences itself and the service moves to another machine. 2008-04-01 Fabio M. Di Nitto [FENCE] Make sure to version and copyright all built files As with all fence agents, this info can be useful. Store it in the modules even if only the libs print it out. [FENCE] Fix fencelib to print version and copyright [BUILD] Enable build and install of experimental fence agents [BUILD] Royal cleanup of the fence agents build system Collapse fenceperl and fencepy into fencebuild. fencebuild now uses a much simpler and slightly more clever shell script scripts/fenceparse instead of scripts/define2var that was no longer used anywhere in the tree. update all the fence/agents/ Makefiles to use fencebuild.mk. make fencebuild understand more than one target at a time.
[FENCE] Remove obsoleted fence_apc perl implementation [FENCE] Move apc_snmp README where it belongs [FENCE] Move apc_snmp README where it belongs [BUILD] add enable_experimental_fence_agents configure option [BUILD] Add fencelibdir support The new fence agents share a common python library. Make it's location configurable at build time (default to /usr/lib/fence). Update all targets required to handle it. Update the agents accordingly. NOTE: you will need to re-run configure to propagate the new configuration option. The new agents are not used/build/installed yet. This will happen shortly. NOTE to packagers: the new library depends on pexpect. [BUILD] Fix fenceperl and fencepy make snippets to allow multiple targets 2008-03-31 Lon Hohberger [CMAN] Make cman init script start qdiskd intelligently Resolves qdiskd/cman start ordering so gfs file systems are mounted reliably in 1-node+qdisk boot situations. 2008-03-31 Dejan Muhamedagic RA oracle/oralsnr: handle instances which are in the backup mode Two new attributes are defined: - clear_backupmode (boolean): if set to true, then check if the instance is in the backupmode and do alter database backupmode end when appropriate - shutdown_method (checkpoint/abort or immediate): shutdown method Also, 'su orauser' is replaced with 'su - orauser' when invoking oracle specific programs in both oracle and oralsnr. A typo fixed in oralsnr. --HG-- extra : convert_revision : 38ce26c62e352b42decc683edb6ccd2d4389eaa7 2008-03-31 Lon Hohberger Revert "[CMAN] Make cman init script start qdiskd intelligently" This reverts commit 2b6e388e72e6a35f2133c98a714dc49ab747cf80. Need review. [CMAN] Make cman init script start qdiskd intelligently Resolves qdiskd/cman start ordering so gfs file systems are mounted reliably in 1-node+qdisk boot situations. 2008-03-29 Dejan Muhamedagic RA Xen: fix default in name processing --HG-- extra : convert_revision : 1f9894aaa472987ea8b9fe2c1f3b5bef9cefce82 2008-03-29 Fabio M. 
Di Nitto [BUILD] Remove extra debugging entry [FENCE] apc_snmp: allow paths to snmp binaries to be configurable [FENCE] Enable fence_apc_snmp Separate fence_apc_snmp from apc in its own directory. Update the build system to understand --mibdir and propagate it down to fence_apc_snmp and the install targets. 2008-03-28 Marek 'marx' Grac fence/agents: Add obsolete options Added obsolete/dual options like hostname/ippaddr, action/option. These options are either translated to new or they are not used anymore/yet in new agents (eg. version of firmware). fence/agents: WTI agents merged Fencing devices from WTI can use password or login/password for logging in. This patch merges two fencing agents together so it will be same as old agent. fence/agents: New fencing agents There are new fencing agents based on a new library. They need a 'pexpect' package. If it is possible there is support for both telnet and ssh. In this patch there are agents for: APC, BladeCenter, Drac 5, ILo and WTI. It is possible that backward compatibility is broken (to be fixed). 2008-03-28 Fabio M. Di Nitto [CMAN] qdisk: add credits to Joel [CMAN] Do not duplicate entries in the objdb A dump-db did show duplicate entries of logging and totem because we were not looking for previous entries correctly. Make sure to reset to the proper handle before searching. 2008-03-27 Fabio M. Di Nitto [CMAN] Fix config handling Some objects were not relocated properly and it was breaking logging among other bits. 2008-03-26 Jonathan Brassow rgmanager/lvm.sh: Fix bug 438816 Missing VG name parameter to 'vgchange --deltag' call caused tag to be removed from all VGs, not just the intended one. Thanks to Simone Gotti for the patch. 2008-03-26 Fabio M. Di Nitto [BUILD] Fix man page install permission Man pages were installed -m755, which is wrong. Install them as -m644 as it should be. 2008-03-25 Bob Peterson Update to prior commit for bz431945: I forgot that STABLE2 does not have a diaper device.
2008-03-24 Lon Hohberger [cman/qdisk] Fix type pun errors in proc.c 2008-03-24 Dejan Muhamedagic RA Xen (LF 1858): wait for the domain to finish starting in the start operation --HG-- extra : convert_revision : 17c0cf487322287d0689a036c32f21b900ce5a80 RA Xen (Novell 369724): multiple issues - add a name attribute if one can't rely on config parsing; a must if the config is xml - deal properly with non-existing config files in particular if they reside on shared storage --HG-- extra : convert_revision : 30fc4793b31164300370d7b9ffa2dac4ec4e8e09 2008-03-24 Sebastian Reitenbach RA Xen (LF 1830): redirect xm list errors to /dev/null --HG-- extra : convert_revision : 487000f75735be59114726537b13c1fe656e5ff8 2008-03-24 Fabio M. Di Nitto [BUILD] Set -MMD as default CFLAGS Enable non-system header files dependency tracking by using -MMD in combination with -include .d files. This change will allow developers to edit an header file and simply fire a make to rebuild all the objects that are affect by the header change. Before this change a make clean and make were required to propagate the header change. NOTE to developers: you will need to rerun a configure to set -MMD in the default CFLAGS. All directories that builds objects now have a bunch of extra .d files that can be safely ignored as they are automatically generated and ignored by git. [BUILD] Update .gitignore for .o and .d files 2008-03-21 Fabio M. Di Nitto [BUILD] Fix handling of version and libraries soname The overall handling of release_major/minor/micro was wrong and it was discovered only after the first STABLE2 release. This change obsoletes release_major/minor/micro and introduces 3 new configure options: release_version, somajor and sominor. And add options to make/official_release_version. 
somajor/sominor should only be numbers (no sanity check is performed) and they can either be specified when invoking configure (both must be present at the same time and they will override official_release_version) or in make/official_release_version as: SONAME "2.2" If not specified anywhere they will be generated as before. release_version can now be a random string that will be shown when asking for tool versions. When specified from the configure it will override the one specified in make/official_release_version as previously documented (ex: VERSION "2.02.01"). If not specified anywhere it will be generated by somajor.sominor. NOTE to developers: you will need to rerun configure to regenerate make/defines.mk. NOTE to packagers: sorry about this intrusive change but the problem was noticed too late. Removing --release_major/minor/micro from your configure invocation should be enough and usually those values should be set by upstream. While we provide an easy option to override values, please try to keep them consistent with upstream for easy tracking of bugs vs released versions. NOTE to release manager: make/official_release_version is not tracked in git and should now contain 2 keywords to work properly (order is not important). [CMAN] Fix building when -DDEBUG is not specified [CMAN] Drop dependency on libdevmapper 2008-03-20 Lon Hohberger [cman] Apply missing fix for #315711 [cman] Make mkqdisk print all device paths [cman] Fix qdisk Makefile / disk_util merge bugs [cman] Merge scandisk & fixes from RHEL5 branch Scandisk replaces the old scanning of /proc/partitions with a flexible library created by Fabio M.
Di Nitto Fixes from the RHEL5 branch include: * ability to use block devices with >512 byte sector size * bug causing infinite "Node X is undead" messages 2008-03-20 Christine Caulfield [DLM] Mention lidlm_lt in the man page 2008-03-20 Lon Hohberger [rgmanager] Remove unused lockspace.c file 2008-03-20 Lars Marowsky-Bree RA: Xen: Failing monitor call-outs would incorrectly cause the rsc to be treated as stopped (LF#1856) --HG-- extra : convert_revision : ee683ad12afa4d6d5ba2db4ee1922cc54f6a0e85 2008-03-19 Joel Becker libdlm: Don't pass LKF_WAIT to the kernel libdlm is passing LKF_WAIT to the kernel. In the kernel, the unlock path strictly audits flags, and errors on this unknown (to the kernel) flag. The correct answer is to keep the flag in userspace. 2008-03-19 Lon Hohberger [rgmanager] Fix #432998 Fix a bug causing incorrect return codes during service stop operations 2008-03-19 Bob Peterson Merge branch 'master' of ssh://sources.redhat.com/git/cluster into master.bz431945 Resolves: bz 431945: GFS: gfs-kernel should use device major:minor Resolves: bz 421761: 'gfs_tool lockdump' wrongly says 'unknown mountpoint' re HP cciss RAID array 2008-03-19 Christine Caulfield [CMAN] valid port number & don't use it before validation cman_send_data didn't validate the port number. This was not a great problem, but it was slightly silly. cman_start_recv_data used the port number before validation in a debug message, this could crash the server if debug is enabled and a very large port number was passed in. [CMAN] Don't declare a variable in the middle of a block Later gccs seem to allow this behaviour (and leaving it like that was a mistake on my part) but I've moved the declaration of 'qm' back to the start of the block for normality's sake. 2008-03-18 Christine Caulfield [CMAN] Limit outstanding replies This commit imposes a limit on the number of outstanding replies that a connection can have. 
This is to prevent a DoS attack that causes cman to eat all available memory by sending lots of requests and not reading the replies. The default is 128; it can be set in cluster.conf as [CMAN] Free up any queued messages when someone disconnects When a client disconnects we need to go through the list of queued replies to get rid of any that have not been collected. [CMAN] Make cman cope with the new objdb structure Now the /cluster bits are held on the objdb below "cluster" we need to look for everything (well, nearly everything) under there. Also add a 'cman_tool dump-db' command which is only built in DEBUG mode. [CCS] Fix the config loader for good We were removing the "/cluster" top level domain to allow /cluster/totem and other random ais bits to be at the top level of the objdb since aisexec expects to find them there. This is not clean in several ways because it would break queries like: /cluster/child::* by returning a bunch of top-level objdb objects that are internal to aisexec and that we don't want to expose to everybody. This is achieved by doing a few tricks here and there; the best solution would be to allow objdb symlinking. First we load the special bits for aisexec at the top level by using the right path within the xml config and then we load (again) the whole configuration. The final result will have some duplicate entries but for now we can live with that. The overhead is minimal. 2008-03-14 Bob Peterson Resolves: bz 435917: GFS2: mkfs.gfs2 default lock protocol differs from man page 2008-03-14 David Teigland libdlm: fix lvb copying When a program does a lock operation that reads an lvb, libdlm copies the lvb data from a bogus location instead of from the proper offset in the buffer it just read. The location of the lvb data is calculated wrongly due to a missing cast.
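The missing-cast failure mode described in the libdlm entry above can be illustrated with a small sketch. This is a Python simulation of C pointer arithmetic, not libdlm's code: the 16-byte header layout, the field holding the LVB offset, and all function names are assumptions made up for the example.

```python
import struct

# Hypothetical record layout (NOT libdlm's real structure): a 4-byte
# little-endian field holding the byte offset of the LVB, 12 bytes of
# padding, then the LVB bytes themselves.
HEADER = struct.Struct("<I12x")  # 16 bytes in total
LVB_LEN = 8


def build_record(lvb: bytes) -> bytes:
    """Build a buffer whose header says the LVB starts right after it."""
    return HEADER.pack(HEADER.size) + lvb


def read_lvb(buf: bytes) -> bytes:
    """Correct read: treat the stored offset as a byte offset.

    This corresponds to casting the struct pointer to a byte pointer
    before adding the offset in C.
    """
    (offset,) = HEADER.unpack_from(buf)
    return buf[offset:offset + LVB_LEN]


def read_lvb_missing_cast(buf: bytes) -> bytes:
    """Buggy read: simulate adding the offset to a struct pointer.

    In C, `ptr + n` advances by n * sizeof(*ptr) bytes, so without the
    cast the offset is scaled by the header size and points elsewhere.
    """
    (offset,) = HEADER.unpack_from(buf)
    scaled = offset * HEADER.size
    return buf[scaled:scaled + LVB_LEN]
```

With a record built by build_record(b"LVBDATA!"), read_lvb() returns the stored bytes, while read_lvb_missing_cast() looks far past the LVB and does not. In C the equivalent read would touch unrelated memory rather than return an empty slice.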
2008-03-14 David Lee config: improve 'REBOOT_ARGS' check --HG-- extra : convert_revision : 6a6ba4591f0314492def7256edfa051613e19d81 2008-03-13 David Teigland dlm_tool: print correct rq mode in lockdump The rq mode in a lockdump was incorrectly showing up as NL for granted locks instead of IV (invalid). 2008-03-13 Christine Caulfield [DLM] Don't segfault if lvbptr is NULL Calling dlm_lock* with LKF_VALBLK and sb_lvbptr set to NULL could cause libdlm to segfault. Now it returns -1/EINVAL 2008-03-12 Lars Marowsky-Bree RA: Update SAPInstance to 1.91 - Added support for SAP WebAS Java 7.1 --HG-- extra : convert_revision : d8ae7c67a6bcca0c3e15b9778d66610736188817 RA: Update SAPDatabase to 1.91 (alexander.krauth@realtech.com) - Added support for SAP WebAS Java 7.1 - Fixed verification of database startup of J2ee-only database. - Fixed status method for dbtype ADA, read process owner from /etc/opt/sdb. --HG-- extra : convert_revision : 81cbad41539a01d1981f6cf857c95c2eaa4ffae6 2008-03-12 David Lee ccdv: maintain via autotools; ensure built before anything else --HG-- extra : convert_revision : ee37f934ba1c8df40cff3f5835972684d6dbcf2a 2008-03-11 Lon Hohberger [rgmanager] Set cloexec bit in msg_socket.c Resolves Red Hat bugzilla #433313. Rgmanager did not have the close on exec bit set for any sockets it was managing, causing problems with SELinux [rgmanager] Make ip.sh check link states of non-ethernet devices Resolves: Red Hat Bugzilla #331661 2008-03-11 David Teigland groupd: purge messages from dead nodes bz 436984 In the fix for bug 258121, 70294dd8b717de89f2d168c0837c011648908558, we began taking nodedown events via the groupd cpg, instead of via the per group cpg. Messages still come in via the per group cpg. I believe that that opened the possibility of processing a message from a node after processing the nodedown for it. 
In Nate's revolver test, we saw it happen; revolver killed nodes 1,2,3, leaving just node 4: 1205198713 0:default confchg left 3 joined 0 total 1 1205198713 0:default confchg removed node 1 reason 3 1205198713 0:default confchg removed node 2 reason 3 1205198713 0:default confchg removed node 3 reason 3 ... 1205198713 0:default mark_node_started: event not starting 12 from 2 2008-03-11 Lon Hohberger [rgmanager] Don't call quotaoff if quotas are not used [CMAN] Fix "Node X is undead" loop bug This was caused by an improper assignment to ps_incarnation after a node decides to evict another node. The fix is to simply make the internal (memory) assignments before calling qd_write_status() 2008-03-11 Fabio M. Di Nitto [CCS] Fix possible memory corruption on double free 2008-03-10 Fabio M. Di Nitto [CCS] Cleanup duplicate vars from previous commit [CCS] Fix xml -> objdb config import 2008-03-10 David Lee portability: repair 'ccdv' maintenance (see gcc, 29/02/2008) with a little more resilience --HG-- extra : convert_revision : c8aa3a3586f87cd4a1a7a837e224f2eb3d3ae856 2008-03-08 Fabio M. Di Nitto [BUILD] Allow release version to contain padding 0's 2008-03-07 Dejan Muhamedagic RA: add the OCF RA DTD ra-api-1.dtd --HG-- extra : convert_revision : 854c65db6e062cb002235a11e9182bbb2127e11b RA ids: allow meta-data and usage under all circumstances --HG-- extra : convert_revision : c23351f48a12b8acf9b9d27a7298c52a7ab5518f RA Xen: allow meta-data and usage under all circumstances --HG-- extra : convert_revision : 7f953187b5a1df631f55daaeddef23427f031746 2008-03-07 Fabio M. Di Nitto Add toplevel .gitignore Ignore: make/defines.mk [CCS] Upload all subsystem configs into objdb By adding the the whole cluster.conf to the objdb, we can start moving all the subsystems away from using ccs directly and ask cman for config bits. [BUILD] Fix configure script to handle releases Add concept of release_micro value used only for RELEASE_VERSION. 
Attention packagers: release_micro is a mandatory value. Add support for encoding release versions in the tarball by adding a file at release time: cluster/make$ cat official_release_version VERSION "2.2.10" will automatically set the release to 2.2.10 with library sonames to 2.2 NOTE: manual values will always override whatever is set by default. 2008-03-04 Lon Hohberger [fence] Make fence_xvmd support reloading of key files on the fly. Merge branch 'master' of ssh://lhh@sources.redhat.com/git/cluster Add / fix Oracle 10g failover agent Update changelog Add Sybase failover agent 2008-03-04 Xinwei Hu STONITH hmchttp: new STONITH external plugin External STONITH module for HMC web console --HG-- extra : convert_revision : 75c5e3d30dfbadbf171da98d3156ca29587176ca 2008-02-29 Ryan McCabe Merge branch 'master' of ssh://sources.redhat.com/git/cluster Feeling pedantic. More spelling fixes. 2008-02-29 Lon Hohberger Merge branch 'master' of ssh://lhh@sources.redhat.com/git/cluster Fix #435189 - fenced override doesn't allow rgmanager to recover because it doesn't tell cman that fencing was completed. 2008-02-29 Ryan McCabe Merge branch 'master' of ssh://sources.redhat.com/git/cluster Fix a few misspellings 2008-02-28 Christine Caulfield Initialise votes to 0 The code in get_cman_join_info() expects the local variable 'votes' to be initialised to zero, but it wasn't being. Fix multicast display in 'cman_tool status' Due to misreading of a man page, inet_pton was being called incorrectly and not returning a valid sockaddrin[6]. 2008-02-27 Fabio M. Di Nitto [CMAN] Move ccs config ais module into ccs/ccsais 2008-02-27 Ryan McCabe Fix bz434790 2008-02-27 Christine Caulfield Merge branch 'cman3' [CMAN] Remove deleted nodes from our list Detect nodes that have been deleted from CCS and remove them from our list if they are dead. 
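The rule in the "[CMAN] Remove deleted nodes from our list" entry above (drop a node only when it is both gone from the configuration and dead) can be sketched as follows. The Node type and the function name are hypothetical and are not taken from cman.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Node:
    name: str    # node name as it would appear in the configuration
    alive: bool  # whether the node is currently a live cluster member


def prune_deleted_nodes(nodes: Iterable[Node], configured: Set[str]) -> List[Node]:
    """Keep a node if it is still configured or still alive.

    A node that was deleted from the configuration but is still running
    stays in the list; it is dropped only once it is also dead.
    """
    return [n for n in nodes if n.name in configured or n.alive]
```

Under these assumptions, a live node that is missing from the configuration is kept, which matches the "if they are dead" condition in the entry.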
[CMAN] Don't ignore cman_tool version the processing of the 'cman_tool version' command updates the config_version variable immediately rather than waiting for the message to come back. Because of this, when the message /does/ come back it gets ignored because cman thinks it already has the latest config! The solution is, of course, not to update the config version until we receive the RECONFIGURE message. 2008-02-26 Lon Hohberger * Make fence_ack_manual.sh accept -n * Ensure fence_rps10 defaults to reboot * Make clustat not display estranged nodes which are now offline 2008-02-25 Lon Hohberger Correct incorrect netmask handling in ip.sh 2008-02-22 Chris Feist Test git commit. Removed newline. Test git commit. Added date. (test git commit) 2008-02-22 Serge Dubrouski Corrected status function to properly support more PostgreSQL instances. Some cleanups to improve the readability. --HG-- extra : convert_revision : 527d1843faeb23cbf0b3e9934b7afe90c8754416 2008-02-22 Christine Caulfield cman3 commit Separates out ccs interaction from cman itself, so we can replace the configuration back-end easily. 2008-02-21 David Teigland updates 2008-02-21 akamatsu.hiroshi@yes.nttcom.ne.jp RA mysql: fix handling of OCF_RESKEY_enable_creation --HG-- extra : convert_revision : 95d36b1d2beac8cd33a92a01ee741b227cd7fa4f 2008-02-21 Christian Rishoj RA mysql: replace == with = in test --HG-- extra : convert_revision : 38954d0d5d438e5e507e1731965d469ab24e40ec 2008-02-21 David Lee init: autoconfiscate an OS-specific pathname --HG-- extra : convert_revision : 654cab744120f67e0371b9836eed0e85d9900f12 2008-02-20 Fabio M. Di Nitto Sync missing commit from RHEL5 branch: Dmitry Monakhov from OpenVZ linux kernel team reports about wrong locking order in gfs_get_parent(). Patch submitted by Vasily Averin (vvs@sw.ru) under Red Hat bugzilla 400651 2008-02-20 Lars Marowsky-Bree Merge upstream changes. 
--HG-- extra : convert_revision : 2c904f58b59a4e3c2baca3eb9c10e72a0b30cc96 OCF RAs: Make which quiet if it doesn't find the binary. --HG-- extra : convert_revision : 17729a25feb72aa1c550999e8a9c6aa943311bfc 2008-02-19 Stephan Berlet IPaddr RA: fix parsing loopback interface for lvs support --HG-- extra : convert_revision : 806faf36d956e5d246cbe56c08d59a2c114ef051 2008-02-19 David Lee configure: determine a 'mail' that can offer '-s subject' --HG-- extra : convert_revision : aa8e7bf027b49c92d27a1246ecb0b01202cf6e7a resource MailTo: use the 'mail' program already determined by configure --HG-- extra : convert_revision : 3c3847e836b4dcc9c5dc833b680955701d56c318 reboot() system call: configure-time detection of number of arguments --HG-- extra : convert_revision : cd50ff4e9cc3380eeb18793128222797b08c72f2 2008-02-18 ccaulfield Allow unnamed parent objects. This fixes a bug where entries appeared under the top-level rather than the clusternode. 2008-02-15 Fabio M. Di Nitto Fix http://bugs.debian.org/465790 2008-02-12 Andrew Beekhof Low: RA: mysql - Add defaults for Gentoo and Debian (patch from Narayan Newton) --HG-- extra : convert_revision : 438288b87c9afc6cd8a401163236a1914210afa8 2008-02-12 Fabio M. Di Nitto Clean up qdisk man page. -\fB...>\fP is not a keyword (starts the line with .) make the man page more consistent with use of /> Man page cleanup. .SP is not a keyword. Change it to .SH. Cleanup man page. Lines starting with "." are man keywords. Just a plain space in front to make it clean. Stop linking against unrequired libraries. A lot of small tools were linking against a bunch of libraries for no reasons. Clean them up as much as possible as static linking is not spotted automatically. Problem spotted by some Debian automatic test tools and reported by Frederik Schüler 2008-02-11 Lars Marowsky-Bree Up version to 2.2.0 to underline the new development line. (Possibly even 3.0 would be appropriate?) 
--HG-- extra : convert_revision : 6ed21d3aeb01a0593cf2ec0d5a5c5ed706475b2b 2008-02-11 Andrew Beekhof High: CRM: Remove the CRM from Heartbeat as it is now maintained as Pacemaker Since December '07 the CRM has been a separate project. The existence of the code in two places can only cause confusion, thus it is being removed by its sole author. --HG-- extra : convert_revision : cb22c50b3ffa65f3347d8e8649194d9747f33eee 2008-02-08 Dejan Muhamedagic RA: fix meta-data to validate against the dtd --HG-- extra : convert_revision : b2d7b6fb1970d14f599e2dc0eea239410e21d84f 2008-02-08 Patrick Caulfield Implement a nicer way of getting the quorum disk information. The libcman API remains the same but the connection to cman itself works using the normal GETNODE call. 2008-02-08 Simon Horman build: Quote $new_libnet I'm not sure if this can ever occur in practice, but if $new_libnet was ever empty, a syntax error would occur without quoting. --HG-- extra : convert_revision : 860ccc31c997917d6c4fff4947db5c2b0853e0c7 2008-02-07 Lars Marowsky-Bree RA: SAPInstance updated to 1.90 from SAP. SAPInstance 1.90 ================ - Fixed function check_sapstartsrv. It did not return and looped endlessly if the sapstartsrv process died immediately after restart. SAPInstance 1.80 ================ - New method recover provides calls to restart a crashed SAP instance using OS kill command and cleanipc. - New Parameter AUTOMATIC_RECOVER to execute the new recover method automatically once, if the start method fails. - New start logic to improve response times during start method - POST_START_USEREXIT is now only called on successful start. - status and monitor methods are now able to monitor SAP WebAS Java stand alone systems (no ABAP). --HG-- extra : convert_revision : cf2b65dd9dd6275343d361b7024719315484b8d0 RA: Update SAPDatabase to 1.90 from SAP. SAPDatabase 1.90 ================ - Fixed function db6udb_recover. It now calls the correct database commands to recover from crash.
- Fixed tempfile name, to allow multiple database instances on one node. SAPDatabase 1.80 ================ - New method recover provides calls to restart a crashed database instance (all types: ORA,ADA,DB6). Oracle will also recover from aborted online backup. - New Parameter AUTOMATIC_RECOVER to execute the new recover method automatically once, if the start method fails. - Fixed problem in detecting MaxDB owner. Now reading from /etc/opt/sdb. - POST_START_USEREXIT is now only called on successful start. SAPDatabase 1.75 ================ - New Parameter JAVA_HOME for SAP WebAS Java systems (without ABAP stack). To set JAVA_HOME to SAP user specific directory, in case root uses another java version. - New Parameter STRICT_MONITORING to activate and deactivate application level database monitoring (R3trans). With Oracle it is useful not to monitor on application level, otherwise a failover will occur if the archiver gets stuck. - Fixed missing x_server start in function maxdb_stop. --HG-- extra : convert_revision : e9cc2a4fad1643809aea7c3566b49c553f246039 2008-02-06 Jonathan Brassow - Bug 431705: HA LVM should prevent users from running an invalid setup (2) - better checking for improper setup -- this time for presence of fail-over VG in the volume_list - better checking for improper setup -- this time for presence of fail-over VG in the volume_list 2008-02-05 Christian Rishoj RA mysql: create pid/socket directories if needed --HG-- extra : convert_revision : 7c621d3fbb4dc04e5768bae93514a8412fc21017 2008-02-04 Dejan Muhamedagic RA oracle: backup changeset e2ef2b0879dd check_binary looks for executables as root. That won't do here. --HG-- extra : convert_revision : 6575926eb46e66c625da05e3ba89519d2d6db5af 2008-02-04 Patrick Caulfield Change a log_printf() into a syslog() so that the die message always arrives in the log. 2008-01-30 Lon Hohberger Fix short read handling in read_pipe Make fenced's override wait time configurable.
Make default TTL 4 instead of 2 per Fabio's recommendation (e.g. RFC2608). Make TTL configurable in cluster.conf/command line for fence_xvm. 2008-01-30 Patrick Caulfield Oops. a bit too much cman3 fell into that last checkin Improve startup error checking and logging. 2008-01-30 Fabio M. Di Nitto Whitespace cleanup Remove unrequire functions. This follow gfs2 changes Bugzilla 227892: * Warn people about the RG corruption and request a gfs_fsck * Upon error detection, perform a minimum error data collection Port forward patch from RHEL5 branch to HEAD. Original author: Wendy Cheng Red Hat bugzilla 244343: GFS supports two modes of locking - lock_nolock for single node filesystem and lock_dlm for cluster mode locking. The gfs lock methods are removed from file operation table for lock_nolock protocol. This would allow VFS to handle posix lock and flock logics just like other in-tree filesystems without duplication. Port forward patch from RHEL5 branch to HEAD. Original author: Wendy Cheng 2008-01-29 Dejan Muhamedagic build: set MGMT_DIR in configure --HG-- extra : convert_revision : 503bf079f914b075382ba1d71ed7f1ee6eac74b9 2008-01-28 Fabio M. Di Nitto Remove obsolete file Bump kernel check to 2.6.24 Fix build warning Remove unused variable fix gfs for the removal of sendfile and helper functions Sendfile and helper functions have been removed in 2.6.24. Migrate to using splice_read and generic_file_splice_read helper function. Update gfs to cope with 2.6.24 export op changes and other bits EXPORT_SYMBOL(xtime) has been removed in 2.6.24. Let's use the exact same value (tv_nsec) just from another source. Update gnbd kernel modules to build with 2.6.24 2008-01-26 Robert Peterson Resolves: bz 223660: man gfs2(8) refers to the gfs2_mkfs manpage 2008-01-25 Lon Hohberger Fix qdiskd master abdication logic (#430264) Fix #430272, #430220 2008-01-24 Robert Peterson Resolves: bz 429633: gfs_tool doesn't recognize GFS file sytem 2008-01-24 Benjamin Marzinski Fix for bz #426291. 
gfs_glock_dq was traversing the gl_holders list without holding the gl_spin spinlock; this caused a problem when the list item it was currently looking at got removed from the list. The solution is to not traverse the list, because it is unnecessary. Unfortunately, there is also a bug in this section of code, where you can't guarantee that you will not cache a glock held with GL_NOCACHE. Fixing this issue requires significantly more work. 2008-01-24 Lon Hohberger Unblock signals after fork() so heuristics using signals don't hang 2008-01-23 Dejan Muhamedagic stonith ipmilan: revived - builds and works with openipmi 1.4 and 2.0 - really checks if the IPMI connection works on status - ipmilan is now enabled by default - tested with a Qlogic BMC - to make it work with openipmi 1.3 would require a series of #ifdefs with separate code; not impossible, but not sure if there's sufficient demand to justify the effort Many thanks to numerous contributors. And to people on the list who kept bringing up the issue with the IPMI support. --HG-- extra : convert_revision : ec3142990e140a41a85fe19ccfca0ac43eed9f2b 2008-01-21 Lon Hohberger Fix ccs connect error handling 2008-01-21 Jonathan Brassow - ccs library now checks for bad file descriptors as input 2008-01-21 David Teigland bz 429546 Fix an alignment problem with ppc64. Things work if we do the byte-swapping on the original structure and then copy it into the final buffer, instead of copying first and then trying to do the byte-swapping at an offset within the send buffer. 2008-01-21 Fabio M. Di Nitto Add fake support for -r option at umount so we don't fail if gfs2 is not umounted by its init script. 2008-01-18 Lon Hohberger fix 429248 2008-01-18 Abhijith Das fix for bz333961 - adds support for -n and -f mount options lon's patch removes 'Domain-0' check which was breaking xvm because cman starts before xend.
patch also allows you to put NODENAME in /etc/sysconfig/cluster 2008-01-18 Lars Marowsky-Bree Merge dev branch with local changes. --HG-- extra : convert_revision : 467aeae8bded6dfd280094b7660e065f86940dd4 2008-01-18 Andrew Beekhof Medium: Build: Only add files that we intend to build/package to AC_CONFIG_FILES() --HG-- extra : convert_revision : b8367b4ee2fa9555806bd5fc253ed7d1630ac0f3 2008-01-17 Lars Marowsky-Bree Low: Fix EvmsSCC meta-data. --HG-- extra : convert_revision : 07d81a1f9ce62e404c2c857b521f5a94b1f3f3cd 2008-01-17 David Teigland odds and ends not committed 2008-01-17 Lars Marowsky-Bree Novell 263195 - SAP Resources Agents in Heartbeat2 need to provide hooks for user customization Technically objectionable, but I bow to the demands of the customers. --HG-- extra : convert_revision : 7e7ff0d31af1d66e4dfd7f32905bfedfa215485c 2008-01-17 Lon Hohberger file oracledb.sh was initially added on branch RHEL5. 2008-01-17 Dejan Muhamedagic RA IPaddr2: provide default for OCF_RESKEY_CRM_meta_clone (thanks to Yves Schumann) --HG-- extra : convert_revision : 8ea748195889040005906137612610bb8cbf8d29 2008-01-16 Lon Hohberger Fix #60 error in #428346 bug 2008-01-16 Patrick Caulfield Zero namelen when doing an unlock. On 32/64 bit systems it can make a horrible mess otherwise. 2008-01-15 Ryan McCabe Allow "option=(on|off|reboot)" (currently only fence_ilo takes "action") 2008-01-15 Robert Peterson Fixup contributed by Andy Price. 2008-01-14 David Teigland fix %llx printf warnings using (unsigned long long) 2008-01-14 Fabio M. Di Nitto Allow the resource to run on Debian/Ubuntu systems without manual patching, by checking for the apache2 daemon if httpd is not available. Replace =~ '^/' syntax with a less bash-dependent version. Use grep -E instead of -P as perl regexp support is not built on all distros.
2008-01-13 Andrew Beekhof Low: Build: dopd can now be built without the crm --HG-- extra : convert_revision : 64eda52ccc32a43da2766033dfacf5e924d95ef8 2008-01-11 Jonathan Brassow - Bug #428448 - HA LVM service fails to relocate when I/O is running Was failing to add new tag when relocating. 2008-01-10 Patrick Caulfield Add command-line override for 2node mode. Because of the way cman re-reads CCS it is quite possible to start up a cluster in 2 node mode manually, then add a third node via CCS (I think) 2008-01-10 Fabio M. Di Nitto Whitespace cleanup Fix alignment issues in rgmanager. This makes it possible to run rgmanager on sparc. Patch by Lon. Tested by both of us on x86, x86_64, parisc, ia64, sparc. 2008-01-09 Robert Peterson Resolves: bz 426670: GFS2: man page for gfs2_tool has commented lockdump section 2008-01-09 Fabio M. Di Nitto Fix mkdir invokation to not fail when dir already exists 2008-01-08 Lon Hohberger Roll back previous patch to ip_lookup.c Fix build problem reported by Chris Feist 2008-01-08 Simon Horman configure: consistent whitespace in log message Signed-off-by: Simon Horman --HG-- extra : convert_revision : 0e12db8e85f62793cc1000aab1b3f0b7d46c3ecd 2008-01-08 Fabio M. Di Nitto Fix "off the source tree" install. This was a small regression introduced with the /etc/cluster/cluster.conf configure bits. 2008-01-08 DAIKI MATSUDA OCF-resources: SphinxSearchDaemon is not installed Hi, All. Congratulations to release Heartbeat 2.1.3. I read Alan's release mail and made own rpm package. So, I was aware of difference that SphinxSearchDaemon RA is not installed. And I attached the simply patch file and please confirm. Cc: DAIKI MATSUDA --HG-- extra : convert_revision : b143f7c497816922783be3294320414fc5d99f76 2008-01-08 Simon Horman IPv6addr: Check for netinet/icmp6.h instead of linux/icmpv6.h Configure currently checks for asm/types.h+netinet/icmp6.h, however this check fails on debian ia64. 
Changing the check to sys/types.h+netinet/icmp6.h resolves the problem for Debian, but breaks RHEL 4. http://developerbugs.linux-foundation.org/show_bug.cgi?id=1660 This revised check looks for sys/types.h+netinet/icmp6.h which is more or less what resources/OCF/IPv6addr.c actually uses, so hopefully this check keeps all the relevant parties happy. Tuomo, could you check this patch and see if it causes you pain? Cc: Alan Robertson Acked-by: Tuomo Soini --HG-- extra : convert_revision : 45a1c405402db63bfe253cf75f5b0cc13f1ae590 2008-01-07 Lon Hohberger Figure out where slang is installed. Correct signed vs. unsigned comparison on sparc64 2008-01-07 Dejan Muhamedagic shell scripts: various cases of overquoting in the for statement (thanks to Takekazu Okamoto) --HG-- extra : convert_revision : 667542dd39f205eaa984899b60b2e12750733baa 2008-01-07 Fabio M. Di Nitto makes it possible to change the default configuration file by setting --confdir (default to /etc/cluster) and --conffile (cluster.conf). NOTE: manpages with hardencoded /etc/cluster/cluster.conf are not updated. If you dare to change these defaults you know what you are doing. NOTE to developers: you will need to re-run ./configure to set the new vars. 2008-01-04 Jonathan Brassow - a regression... When tagging at the LV-level, the script should complain if there is more than one LV / VG. 2008-01-04 Fabio M. Di Nitto Fix clean target. core files have pid attached to them. 2008-01-03 Jonathan Brassow lvm resource script now allows multiple LVs per VG as long as they move together (exist on the same machine). s/validate/verify/ BUG 427377 HA LVM now allows multiple LVs/VG as long as they move together Package builder, please note the addition of 3 new files. 2008-01-03 Patrick Caulfield Get rid of redundant totemip_parse() call. This was in a bad place and could cause aisexec stalls and disallowed nodes, particularly at startup. 2008-01-03 Fabio M. Di Nitto Fix buffer align. 
So far this one makes the entire stack run on sparc up to fenced. 2008-01-02 Lon Hohberger Fix endian issue on big-endian arches 2008-01-02 Patrick Caulfield Use define CMAN_NAME for the purpose for which it was intended Lets see if I can do this commit properly... Fix swab of an int to be swab32 rather than swab16 2008-01-02 Fabio M. Di Nitto Cleanup manpages to work with whatis. Patch from Frederik Schüler Add interpreter to ocf-shellfuncs. Patch by Frederik Schüler 2008-01-02 Patrick Caulfield totempg_ifaces_get() always copies INTERFACE_MAX addresses so make sure we alloate enough space for them all. 2007-12-31 Fabio M. Di Nitto aisexec config parser expects error_string to be set also when we successfully read the configuration. 2007-12-30 Fabio M. Di Nitto Fix error reporting to aisexec. aisexec expects an error_string string set by config_read if config_read fails. The lack of error_string is not checked by aisexec that will segfault. Set error_string properly and clean up the old errorstring that is not used. Fix building when -DDEBUG is defined. Fix build with -DDEBUG 2007-12-26 Simon Horman hbmgmtd: use /etc/pam.d/common-{auth,account} if available I'm not sure if this is a debianism or not, but this solution seems general enough. Closes Debian Bug 444371 See: http://bugs.debian.org/444371 See: http://62.147.165.84/cgi-bin/dwww/usr/share/doc/libpam0g/Debian-PAM-MiniPolicy.gz --HG-- extra : convert_revision : ae486d2c283a0572700cbcece94a8398214d7118 2007-12-24 Fabio M. Di Nitto Once again change ifdef to fix fail to build on hppa/parisc Fix all: target. 2007-12-22 Fabio M. Di Nitto Fix gnbd build dependencies. For too long we did rely on gnbd/Makefile to build in the right order but single builds were broken. Fix fence build dependencies. For too long we did rely on fence/Makefile to build in the right order but fence_tool and fence_node were just broken. Fix a few regressions introduced by the big Makefile clean up: - restore all: target as default. 
Libraries need a small special casing in the include / target order due to var expansion. - fix udev uninstall target: typo in make/uninstall.mk and requires (for sake of simplicity 51-dlm.rules in /lib instead of /script. - gfs/Makefile and fence/agents/Makefile don't need passthrough.mk. - Fix uninstall of symlinks. - Fix uninstall of rgmanager resources. Collapse all install: and uninstall: targets in make/install.mk make/uninstall.mk Change almost all Makefiles to use them. Convert to use make/clean.mk Remove unrequired distclean targets 2007-12-21 Fabio M. Di Nitto Collapse all common clean: target bits into make/clean.mk generalclean: target. Update all relevant makefiles to use generalclean. Required by this change: all TARGETS need to be defined before sourcing *.mk files to allow simpler var expansions. update all makefiles as a consequence apply alpha sort :) Remove obsolete and unused Makefile Install forgotten dlm man pages. Collapse all man Makefiles' common snippets into man/man.mk Change all man Makefiles to use it. From now on it will be enough to source make/man.mk and add a TARGET= with the man page. A long time ago we did start collecting common Makefile snippets in one location. This time we shrink all common objects rules in make/cobj.mk Use newly defined $(OBJDIR) to source .mk files snippet. This reduces the hardcoded paths in Makefiles to one to include defines.mk and makes it easier to change stuff across the tree in one shot. Fix clean target Minor objdir rework to extend flexibility. the first shot of objdir implementation implied that you did: cd cluster ./configure --objdir... etc. cd /path/to/objdir make now you can: mkdir objdir cd objdir /path/to/configure make NOTE: in this case you don't need to specify objdir. it will be automatically set up for you. As a nice side effect you can also be anywhere on the fs and do: /path/to/configure --objdir=/path/to/obj cd /path/to/obj make and it will work.
Fix 2 corner cases when setting up the objdir: - do not symlink symlinks.. this allows to setup multiple objdirs within the source tree. - do not copy/symlink defines.mk from other trees or bad thing happens. 2007-12-20 Fabio M. Di Nitto * globally rename BUILDDIR to SRCDIR to reflect what it really is. * top level configure: - add --objdir=/path option (default to current tree - it does not change current behaviour if not specified and different from current tree) - add a perl subroutine to handle symlinks - check and setup /path - switch all libdirs to use the objdir directly - propagate objdir to make/defines.mk NOTE: those changes require 2 perl modules that should be available everywhere: Cwd 'abs_path'; and File::Basename;. NOTE2: you will need to rerun ./configure after applying the patch. * make/defines.mk.input: - suck in OBJDIR from top level configure - define THISDIR as a relative path from the top level. For example: if PWD /usr/src/cluster/cman/lib, THISDIR will contain cman/lib. - define S as full path to SRCDIR/THISDIR/ * all Makefiles: - convert includes path to use $(S) and always point to the source as defined in make/defines.mk. - fix all object generation targets to use $(S). - fix all install targets to $(S) where required. * Random cleanup: - ccs/daemon/Makefile: do not kill files that are not around. - dlm/tool/Makefile: use ${dlmincdir} and $(SRCDIR)/group/dlm_controld/ instead of relative paths and do not include itself as there are no header files. Also use ${dlmlibdir} for linking. - fence/Makefile: change build order as this is required to avoid extra hacks due to shared C files around. - gfs/gfs_fsck/Makefile: fix symlinking to be relative and not absolute or the symlink is useless. - gfs2/fsck/Makefile: likewise. - gfs2/mkfs/Makefile: likewise. - gnbd/utils/Makefile: create bits. This is to avoid even more hacks than fence/Makefile to cope with shared objects. One day we will need to review all of this. 
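The THISDIR and S definitions in the make/defines.mk.input item above can be restated as a short sketch. It only reproduces the path relationship from the entry's own example (/usr/src/cluster/cman/lib); the function names are hypothetical, and the real build derives these values in make and perl rather than Python.

```python
import posixpath


def this_dir(srcdir: str, pwd: str) -> str:
    """THISDIR: the current directory expressed relative to the top level."""
    return posixpath.relpath(pwd, srcdir)


def source_dir(srcdir: str, pwd: str) -> str:
    """S: the full path SRCDIR/THISDIR/, including the trailing slash."""
    return posixpath.join(srcdir, this_dir(srcdir, pwd)) + "/"
```

Because S is always built from SRCDIR, include and install paths keep pointing at the source tree even when make runs from a separate object directory.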
Cleanup leftovers from the very old build system. We were using a very complex way to set release_major and release_minor because in the old system it was not possible to set them directly from ./configure. Remove the old cruft since ./configure can now take those values directly in input. 2007-12-19 Alan Robertson hg: pulled 'dev' changes from Dejan (and one from Alan) into 'test' --HG-- extra : convert_revision : b814d1c59069ab52c0ad4b2ad4b40c9b9a8104ce 2007-12-19 Lon Hohberger Allow soft dependencies when central_processing is enabled Fix #254111 - when stopping a service using a shared GFS resource, it umounts it even if other services are using it. fix typo in clusterfs.sh 2007-12-19 Serge Dubrouski pgsql RA: postmaster confusion --HG-- extra : convert_revision : b37a33e67cb8d51ddbeb239696dda04813219d21 2007-12-19 Fabio M. Di Nitto Fix extracflags and extraldflags to be recognized as options or configure will fail. 2007-12-18 Alan Robertson hg: merge - brought over changes from 'test' - I think this _should_ be the final set for 2.1.3 --HG-- extra : convert_revision : d0eacce50c0988aa5b855445b09ba4271ec895e3 2007-12-18 Hideo Yamauchi apache RA: remove children in case the top process didn't --HG-- extra : convert_revision : 424db20a45853905fe58d24e79fa7ffc8e423107 2007-12-17 Dejan Muhamedagic Xen RA: new improved (thanks to Sebastian Reitenbach) New features: - memory management - DomU migration - possibility to monitor services within the DomU --HG-- extra : convert_revision : 559d5b05aff7885f8a7cf53f6bd73aa18b89c33d 2007-12-14 Lon Hohberger Fix misc central events bugs. Add return value for inability to run due to exclusive flag being present 2007-12-14 Fabio M. Di Nitto Make sure we invoke virConnectOpen with a proper URI. NULL is deprecated in libvirt and we have no control over distro defaults that might not be xen:///. 
Patch by Soren Hansen 2007-12-13 Alan Robertson hg: branch merge - brought over changes from 'dev' --HG-- extra : convert_revision : 79125c1dc2cf7cfc630dc31c4ff6dde911c9262a Junko IKEDA reported a problem building on Red Hat with tog-pegasus installed - this fixes that problem - the code half-setup things if pegasus was present but --enable-cim-provider wasn't enabled Risk: low Importance: medium --HG-- extra : convert_revision : 3914fa415bd094f47dfc331309ea2ecb3547e6b9 2007-12-13 Patrick Caulfield Allow rrp_mode to be overridden in cluster.conf 2007-12-13 Dejan Muhamedagic pgsql RA: handle the missing pg software properly (thanks to Serge) --HG-- extra : convert_revision : 2381b9ab11f9d094af1e2e0024a734415dd24f0b 2007-12-13 Patrick Caulfield Fix altname option 2007-12-12 Alan Robertson LF bugzilla 1667: ppc64 RPMs contain 32-bit binaries debltc patch to Makefile and configure.in to force 32-bit package risk: low importance: critical to PPC users --HG-- extra : convert_revision : 807f58320ac6574531439b01cd9aabdbef98b6f8 2007-12-12 Lon Hohberger Misc. minor central processing bugfixes 2007-12-12 Dejan Muhamedagic build: my configure.in silliness from 0890907b816f (sorry) --HG-- extra : convert_revision : 9ea392d91f2cf108820296e22303f42a8737e971 2007-12-12 Lon Hohberger Add missing ds.h 2007-12-12 Alan Robertson RPM and specfile changes to fix part of and help diagnose part of Junko IKEDA's problem with tog-pegasus These changes should be harmless to any working CIMOM configuration. 
--HG-- extra : convert_revision : 40e94bf2f937db56548d2837d7f02053283365f1 2007-12-12 Dejan Muhamedagic iscsi RA: wait for udev to create the links --HG-- extra : convert_revision : 0bbb0427ef188462adc9897275ab3f4e951597ad 2007-12-11 Alan Robertson hg: merge - pulling changes from 'dev' to 'test' --HG-- extra : convert_revision : 07d75b76fb187649a77c3733510f04c2dfae5c22 2007-12-11 Lars Marowsky-Bree Filesystem: Make mount point required for the GUI (Hideo Yamauchi) --HG-- extra : convert_revision : 34f9a94b9d7cee67163c4b31ccda8aa861633d13 2007-12-11 Ryan O'Hara Fix issue with endian conversion that caused problems for mixed architecture nodes on same subnet. Need to correct swap byte ordering of comm_header_t structure before copying into buffer and when dereferencing. 2007-12-11 Alan Robertson hg: merge from 'dev' into 'test' --HG-- extra : convert_revision : b01154bf5727773f3f7eecb0c252572cd66cbac8 2007-12-11 Patrick Caulfield Some small fixes to the networking param code, thanks to Fabio on IRC 2007-12-11 Dejan Muhamedagic hg push test --HG-- extra : convert_revision : 98ac94933fe5c02a7d6a9867df0da825504bd6bb hg push test --HG-- extra : convert_revision : ed5e6a487dbd3919149ee7dc1e7d8d16050cd45c 2007-12-11 Patrick Caulfield Set networking parameters suitable for running DLM over sctp Tidy comments 2007-12-11 Dejan Muhamedagic build (LF 1803): add -fgnu89-inline to CFLAGS if gcc version is >=4.1.3 and <4.3 --HG-- extra : convert_revision : 0890907b816ffcc300bfa177d7f99d8455a2d73b 2007-12-10 Dejan Muhamedagic pgsql RA: move check for root down to enable meta-data for regular users (thanks Serge) --HG-- extra : convert_revision : 6f74673bc59dfe033797e70239c6bdd4c0440a8b 2007-12-10 Lon Hohberger Fix type-punned errors on i386 2007-12-10 Patrick Caulfield Add multi-path capability. Each address we get from cman is now passed into the DLM. It's still incumbent on cluster.conf to set the transport to sctp. 
2007-12-10 Alan Robertson hg: branch merge - pulled changes from 'dev' --HG-- extra : convert_revision : 614e0a83be4bead4cf6a81c53463859d8cec7ef4 lf bug 1678: stonith command core dumps with -h option severity: serious - also keeps metadata from being collected from any STONITH agents risk: very small NOTE: This bug fix also includes two changes which get rid of superfluous output One it raises the debug level required for it (it's REALLY verbose with debug=1 now) The other gets rid of something which was really debug output, even though it was logged as "info" output. --HG-- extra : convert_revision : 11a5870030d1cd330871bc7374e1f3637968ded1 2007-12-10 Dejan Muhamedagic RA iscsi: get rid of one annoying message ("no active sessions") --HG-- extra : convert_revision : bbc1edd0d1ebea76d74d874a89742bba557fcd0a 2007-12-10 Alan Robertson hg: brought over changes from 'dev' --HG-- extra : convert_revision : 41f8a38668c55d58b9b3566a8767a9e4edff8c83 OpenBSD bugs: 1731 1761 1743 1659 risk: believed to be low - nearly all changes are in OpenBSD specific code --HG-- extra : convert_revision : 070f8d6814e212d641a9de03e0a7ad27958a3ae4 2007-12-08 Alan Robertson Corrected specfile weirdness - RPMREL wasn't 1 --HG-- extra : convert_revision : 3eb9ae52d0062e9db4dfc8146c812d992ea1184c 2007-12-07 David Teigland new plock ownership related stuff 2007-12-07 Lon Hohberger Make S-Lang library & include paths configurable. NOTE: You MUST rerun configure after applying this update or rgmanager will no longer build. 2007-12-06 Lon Hohberger Fix format warnings on newer GCC Add missing sets.h 2007-12-06 Patrick Caulfield Add option to disable kernel_check. From Soren Hansen: "It's handy to be able to disable the new kernel version check if we don't actually have the kernel headers around, but know that the proper stuff is around when it's needed." 
2007-12-05 Lon Hohberger Preliminary GFS2 support in clusterfs.sh 2007-12-05 Patrick Caulfield Print votes of quorum device in cman_tool status 2007-12-05 Alan Robertson hg: branch merge - pulling from 'dev' --HG-- extra : convert_revision : 17046c58d625d5a48fc83513804447a49df13f48 2007-12-04 Lon Hohberger Merge force-unmount from RHEL5 branch for netfs.sh script 2007-12-04 Ryan O'Hara BZ 323111 Remove permission() checks from xattrs ops. 2007-12-03 Dejan Muhamedagic RA Filesystem: add lustre to the list of fs which don't need fsck --HG-- extra : convert_revision : 90e953bdaab05466b74beaa8e7617d62396ec5e9 2007-12-03 Alan Robertson Merged in changes from 'dev' into 2.1.3 test... --HG-- extra : convert_revision : 6735dbc3a635cf8ff4658c5d4a7bb9859d4f4567 2007-12-01 Dejan Muhamedagic RA iscsi: support iscsi sles compat mode --HG-- extra : convert_revision : 86333d3a937780d9290d7bd4c8698e8e4baf6374 2007-11-30 Lon Hohberger Add centralized S/Lang event script engine v0.8.1 Merge from RHEL5 branch, pass 3 Merges from RHEL5 branch - round 2. Merges from RHEL5 branch - round 1. 2007-11-30 David Teigland change some log messages 2007-11-29 David Teigland Testing revealed a couple more races I hadn't expected. 
2007-11-29 Dejan Muhamedagic RA iscsi: fix meta-data xml --HG-- extra : convert_revision : 78390dbfb632c4d3e46eb03dd9118b72f25b82f1 2007-11-29 Robert Peterson Resolves: bz 325151: GFS2: gfs2_fsck changes to system inodes don't stick 2007-11-29 Dejan Muhamedagic packaging: include iscsi --HG-- extra : convert_revision : 036479adfffeff99b18d1d065636e1a395b8c43c RA iscsi: dropped complex code in favour of udev and several small cleanups --HG-- extra : convert_revision : 27f6de79ea020414f44c9afe287680335f21d8e9 RA SAPDatabase/SAPInstance: report proper exit codes in case required software is not installed --HG-- extra : convert_revision : c6fb08b029381f83728505d17b0aeecc0233069d 2007-11-29 Patrick Caulfield Reinstate 'cman_tool join -X', allowing people to start a cluster without the hassle of a cluster.conf file. There are some caveats to this, which are mentioned in the man page. 2007-11-28 David Teigland A performance optimization for plocks. This speeds up locks that are repeatedly accessed by processes on a single node. Plocks used by processes on multiple nodes work the same way as before. The optimization is disabled by default, and can be enabled by setting in cluster.conf, or by starting gfs_controld with "-o1". It is disabled by default because enabling it breaks compatibility with previous versions of gfs_controld. If all nodes in the cluster are running this version, then plock_ownership can be enabled. The plock_ownership mode needs extensive testing. This also introduces some minor changes when plock_ownership is disabled, so new testing is also required in that mode. Abhi and I worked on this together. 2007-11-28 Robert Peterson Resolves: bz 402971: GFS2: gfs2_edit savemeta doesn't save rindex file. 
2007-11-28 Dejan Muhamedagic RA iscsi: two echo statements removed --HG-- extra : convert_revision : 436fed03563ca4768320edfe55122581dcaa3925 RA iscsi: new iscsi resource agent --HG-- extra : convert_revision : 3a3a910aa4ac21f03c4541921dcfc0aae3921baf RA mysql: don't look for mysql users in the OS database (thanks to Christian Rishoj) --HG-- extra : convert_revision : 289ab2304e76288a93b25dd2ec2ac3d465ab0321 2007-11-27 Fabio M. Di Nitto Do not install stripped binaries 2007-11-27 Dejan Muhamedagic RA SphinxSearchDaemon: a new RA thanks to Christian Rishoj --HG-- extra : convert_revision : 9dee52b29c4e695ce9527bd05fa24311259f8e95 2007-11-24 Fabio M. Di Nitto Switch configure to use perl warnings and fix them up. Add kernel_version version check subroutine. Set minimal kernel version requirement to 2.6.23. Many thanks should go to Marian Marinov for the original patch and contribution. 2007-11-21 David Teigland ASSERT was doing fprintf(stderr) which goes somewhere we don't want when running as a daemon. 2007-11-21 Alan Robertson LF bug 1690: there should be a tool to audit (validate) node names in constraints among other things... Bug URL: http://developerbugs.linux-foundation.org//show_bug.cgi?id=1690 severity: enhancement risk: near-zero utility: high --HG-- extra : convert_revision : 354a8b4f62132528a0d1d2f45e3a7e9899ca2cd8 2007-11-20 Patrick Caulfield Clear out the ports opened list when a node goes down. Thanks, Lon. bz#327721 2007-11-17 Robert Peterson Resolves: bz 382581: GFS2: gfs2_fsck: buffer still held for block 2007-11-15 Robert Peterson Add the "printsavedmeta" option to the gfs2_edit man page. gfs2_edit wasn't printing directory entries and extended attributes correctly. Added ability to save inode extended attributes in "savemeta". This is necessary in order to test bug #382581. Added ability to recurse one level on directories when printing. Fixed bugs associated with traversing directory leaf blocks.
Added ability to recognize and display log buffer and quota change blocks. Simplified code by breaking up display_indirect into two functions: one for indirect blocks, the other for directory leaf blocks. 2007-11-15 Fabio M. Di Nitto Fix bugzilla 362031 2007-11-14 Robert Peterson Resolves: bz 352841: GFS2: Evaluate and implement missing gfs2_tool features 2007-11-14 Dejan Muhamedagic RA Xinetd (LF 1742): multiple fixes - better parsing of services (thanks to Matt Zagrabelny) - don't rely on the pid file because not all distributions use the pid file - update meta-data to explain how xinetd should be started - prevent unnecessary noise for the monitor operation - replace HA_SYSCONF_DIR (non existing) with /etc --HG-- extra : convert_revision : ddbcbf6038a49566b2b6140ca9c8f6345ccee94b 2007-11-14 Fabio M. Di Nitto Hard encode paths to (u)mount.gfs* given the very nature of mount(8) api to look only in /sbin for (u)mount helpers, we can hard encode the install paths for our tools into the Makefile systems. I have never seen anywhere a different behavior in any Linux distribution and it will make packagers' lives simpler. Thanks also to Marc - A. Dahlhaus for spotting the missing DESTDIR entries. 2007-11-13 Robert Peterson Fixed printing of gfs1 journals. Resolves: bz 364741: GFS2: gfs2_quota doesn't work unless lock table specified 2007-11-12 Patrick Caulfield file msgtest.c was initially added on branch RHEL4. 2007-11-12 Fabio M. Di Nitto Be consistent across the entire tree on AR and RANLIB invocations The new Makefile system never invokes LD directly (and this is a good thing). Clean up STRIP usage. It is not consistent and we shouldn't strip random binaries.
2007-11-10 Alan Robertson LF bug 1772: apphbd needs to be able to have clients declare themselves critical resulting in a "fast fail" of the system when they fail priority: non-essential enhancement risk: low to very low --HG-- extra : convert_revision : 3b6986fcfdd97252e6e33d997456e0f3eba5095a 2007-11-09 Alan Robertson LF bug 1762: ComponentFail test results in: Resource ocf::IPaddr:rsc_ibm1 appears to be active on 2 nodes. IMPACT: critical; Risk: minimal Risk: minimal bug impact: critical - data would be destroyed http://old.linux-foundation.org/developer_bugzilla/show_bug.cgi?id=1762 --HG-- extra : convert_revision : b7e98f2a80af6e06a1d97fb54f9a45c86b67e3bc 2007-11-09 Fabio M. Di Nitto Remove cman_wait_init for now. It was becoming overly complicated for such a simple task. 2007-11-08 Jonathan Brassow file rbtree.h was initially added on branch RHEL5. file rbtree.c was initially added on branch RHEL5. 2007-11-08 Robert Peterson Printing the quota file wasn't printing its contents due to a bug. 2007-11-08 Patrick Caulfield Add an explanation of the node states shown by "cman_tool nodes" and some information about the "disallowed" state. 2007-11-07 Robert Peterson Resolves: bz 336561: gfs2_tool accepts jdata flag; man page says no Fix a divide by zero if the target isn't a gfs or gfs2 file system. Resolves: bz 352581: GFS2: implement gfs2_tool lockdump 2007-11-06 Robert Peterson Resolves: bz 354201: GFS2: gfs2_tool: unknown mountpoint on some mount points 2007-11-05 Fabio M. Di Nitto Add cman_wait_init as wrapper for cman_admin_init/cman_init and cman_is_quorate 2007-11-05 Patrick Caulfield Add missing format string.
Enhance API to retrieve just the quorum device information using cman_get_node() 2007-11-05 Dejan Muhamedagic RA tomcat: allow consecutive starts --HG-- extra : convert_revision : 2662edf4ee85fcf821d1d7644fcb070e29dc2b37 2007-11-02 Alan Robertson moderately serious bug, low risk: LF bugzilla 1757: apache resource agent grep methodology can't handle newlines - even if you change the pattern --HG-- extra : convert_revision : 0c518109e5c54c4bc86949a9633ff67b822d500f 2007-11-02 Dejan Muhamedagic mysql RA (LF 1760): defaults for OpenBSD (thanks to Sebastian Reitenbach) --HG-- extra : convert_revision : 4dd57293ecbae1057e2c3bc7aa2f8b243ad026c0 2007-11-02 Alan Robertson minor: trying to get ppc64 to build correctly... --HG-- extra : convert_revision : bd93c4f10b13e8c4db25f5a180f8696b8f457b0e minor: configure.in - autoconf spells ppc64 funny... --HG-- extra : convert_revision : 5f2bec73bbe2badb1395f8c9f84a1710c0713e76 minor: more RPM tweaks... --HG-- extra : convert_revision : 91a2709338c683310a8041a45fdce6254297aed6 minor: changed configure.in to force 64-bit objects when on gcc-based ppc64 platform --HG-- extra : convert_revision : a541eaa2da29f2de1a0b728c7531c3a174d43f3c 2007-11-01 Patrick Caulfield Enable to_stderr logging if 'cman_tool join -d' is used. 2007-11-01 Fabio M. Di Nitto If votes for quorumd is _not_ specified in cluster.conf, then automatically set votes to number of nodes - 1. 2007-10-30 Robert Peterson Resolves: bz 349601: GFS2 requires straightforward way to determine number of journals 2007-10-29 Fabio M. Di Nitto Apply, rework and cleanup second part of patch from Marco Ceci to fix 354421 2007-10-26 David Teigland xid needs to be unsigned long long don't setup netlink if deadlock is disabled 2007-10-26 Ryan McCabe patch from Marco Ceci to fix 354421 2007-10-26 David Teigland Improve the dumping of debug logs from daemons. bz 317181 group_tool reads debug logs from groupd, fenced, and gfs_controld. The dumping code in all three daemons is now identical.
The other change is that the dumping function terminates the final write with \0, and no longer sends the entire 1MB log buffer if it's not full. 2007-10-26 Ryan McCabe Compile with -Wformat=2, which will catch usually dangerous format string bugs Keep gcc from reporting a bogus warning when compiling with -Wformat=2 rgmanager format string fixes More format string fixes Fix format string bug 2007-10-25 Robert Peterson Resolves: bz 337961: gfs_grow /mountpoint/ does not work Resolves: bz 345501: minor correction to previous commit. Resolves: bz 345501: GFS2: gfs2 utils uses non-canonicalized names 2007-10-24 Robert Peterson Resolves: bz #334481: gfs2_jadd man page refers to non-existent -T option 2007-10-24 Ryan McCabe Commit msg with the last commit went missing.. - Fix unsafe string handling: - replace memset(s,c,n);sprintf(s,...); with snprintf with proper error checking - don't overflow the stack if the cluster name specified in the env var is too long - don't overflow the stack if the local nodename from uname(2) is too long - don't overflow the stack if the local nodename specified in the env var is too long - Don't leak the ccs descriptor in get_ccs_join_info() on errors - Fix a couple of small memory leaks in error paths - Handle OOM conditions - Fix unsafe string handling: - replace memset(s,c,n);sprintf(s,...); with snprintf with proper error checking - don't overflow the stack if the cluster name specified in the env var is too long - don't overflow the stack if the local nodename from uname(2) is too long - don't overflow the stack if the local nodename specified in the env var is too long - Don't leak the ccs descriptor in get_ccs_join_info() on errors 2007-10-24 Fabio M. Di Nitto Use standard path var and memset it before each query Use right vars to print debugging info Clean up duplicate ccs query paths 2007-10-23 Fabio M. 
Di Nitto Fix purely cosmetic typo 2007-10-21 Alan Robertson hg - branch merge from upstream ('dev') --HG-- extra : convert_revision : ab96b5da5cb3a24f92392032398cda344543b687 rpm build change: allow openhpi to be optional, but default it to be required. Don't require it for SUSE < 10.2 --HG-- extra : convert_revision : c4267079327b85269afe494e068cba7e5259ee16 2007-10-20 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : 89f29006b54dfda63a24f5d3d77d4a40e06fbed8 2007-10-19 Robert Peterson Resolves: bz 291551: gfs2_fsck clears journals without asking. Minor correction to the previous commit. Bopping through indirect pointers was inadvertently changing variable "block" during savemeta. 2007-10-18 Robert Peterson While working on bz #291551, I discovered that gfs2_edit savemeta only saved true metadata, but we need more than that. There are lots of blocks that are considered "data" (not metadata) by the RG bitmaps that we still need to save. These include: 1. All the system journals (which may contain both metadata and user data, disguised as data blocks within the journal). We need to pick out the parts that do not contain user data. 2. The system files, such as statfs, inum, quota file, etc. These may be helpful in debugging user problems. 3. Indirect block pointers, which may be counted as data for accounting purposes, even though it's metadata. 4. Directory leaf blocks. This change allows gfs2_edit to save and restore these blocks properly, and also to print out a breakdown of a saved metadata file: gfs2_edit printsavedmeta /tmp/gfsmeta It also improves on the information given when a journal is dumped. 2007-10-18 Andrew Beekhof Low: RA: PureFTPd - Support debian's pure-ftpd-wrapper script. Patch by Raoul Bhatia script is already a parameter, however this particular script has slightly different expectations. 
--HG-- extra : convert_revision : e775a0bb00884f33c4faa3526b8fc27403be910e 2007-10-17 Abhijith Das fix bz 311591 - make lock_dlm the default lock protocol in mkfs.gfs and mkfs.gfs2 2007-10-17 Lon Hohberger Include missing debug.h header file 2007-10-17 David Teigland used wrong define, DLM_LOCK_ instead of LKM_ 2007-10-17 Lon Hohberger Make fence_xvmd read options from ccs like it should Make fence_xvmd read options from ccs like it should; merge dbg_printf patch from RHEL5 branch 2007-10-17 David Teigland The output of 'dlm_tool lockdump' could make it appear that a granted lock was still converting because the rqmode reported by the kernel is not reset to IV when a NOQUEUE convert fails. 2007-10-17 Alan Robertson LF bug 1731: wrong location of libraries for OpenBSD 64Bit architectures, with patch My patch is slightly different, but should work the same. --HG-- extra : convert_revision : 5b4aad7a659808fe6bf8c3100dfcdc73a4600962 2007-10-15 Lars Marowsky-Bree RA: mysql: Allow arbitrary commandline arguments for mysqld (by Raoul Bhatia) --HG-- extra : convert_revision : 93e643470d0fb8f458256ec587b7dfd3e6abdd13 2007-10-15 Patrick Caulfield Make sure it compiles against latest openais trunk. 2007-10-15 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : d5e4fe546421d87d666896caded80c9de6741ba8 2007-10-15 Dejan Muhamedagic RA mysql, pgsql: use getent(1) instead of /etc/passwd (thanks to Raoul Bhatia) --HG-- extra : convert_revision : e5c2b1d937ac72407150d42506155b52637caeb1 2007-10-12 Robert Peterson Resolves: 235931: gfs2_edit command to set NOALLOC flag 2007-10-12 Andrew Beekhof Low: Build: Remove unused LDADD entries (libpils.la and libapphb.la) --HG-- extra : convert_revision : c71cad4454066a1745f88c832da5b78444696264 2007-10-12 Robert Peterson Add the ability for gfs2_edit to print gfs1 journals. 
2007-10-11 Robert Peterson Resolves: bz 251180: Build time warnings for gfs2 userland tools Resolves: bz 295301: Need man page for gfs_edit Resolves: bz 240545 (addendum). 2007-10-11 Lars Marowsky-Bree RA: Xinetd: Fix stop/monitor to not fail if service isn't available yet. The monitor operation though is rather basic and probably should also verify whether the process is actually running, not just whether the pidfile exists ... If someone has Xinetd, I'd appreciate a fix. --HG-- extra : convert_revision : 788fbfe3444d00fbb6d34eb3e4f5f3c6355ae9e9 2007-10-11 Ryan McCabe E2BIG is more appropriate than ENOSPC here 2007-10-10 Lars Marowsky-Bree RA: Raid1: Allow the homehost setting to be specified. --HG-- extra : convert_revision : dbb905ef76945f107aa3d6d0c8ae46bfe8e854a0 2007-10-10 Ryan McCabe Allow valid addresses of nodes even if they're not identical to the way they're specified in cluster.conf Fix code that caused warnings on platforms where sizeof(void *) != sizeof(int) 2007-10-08 Robert Peterson Resolves: bz 247318: Need man page for gfs2_edit 2007-10-08 Ryan McCabe add new function to libccs: * ccs_lookup_nodename * @cd: ccs descriptor * @nodename: node name string * @retval: pointer to location to assign the result, if found * * This function takes any valid representation (FQDN, non-qualified * hostname, IP address, IPv6 address) of a node's name and finds its * canonical name (per cluster.conf). This function will find the primary * node name if passed a node's "altname" or any valid representation * of it. 2007-10-03 Marek 'marx' Grac Resolves: #250681 - mount samba share from netfs RA 2007-10-03 Patrick Caulfield Tidy logsys use. Is this OK now Steve ? 2007-10-01 Dejan Muhamedagic hb_report: rpm/install --HG-- extra : convert_revision : d4203fc7b082652ddb6f5445c62c4df3fb4527d3 2007-10-01 Ryan McCabe Fix 314091 2007-10-01 Patrick Caulfield Use "logger_subsys" & "subsys" keys rather than "logger" and "ident". 2007-10-01 Fabio M. 
Di Nitto Use proper vars to disable targets Fix configure to handle properly 0.x or x.0 releases. 2007-09-28 Patrick Caulfield Reinstate cman_tool services, which got lost inexplicably. Call "group_tool ls" for cman_tool services 2007-09-28 Fabio M. Di Nitto configure: Backticks don't work in strings. Use POSIX::uname(). The strings that use backticks to get `uname -r` don't work as expected. Let's use POSIX::uname() directly and let perl do the work. Patch by Joel Becker 2007-09-28 Simon Horman Reformat comment to make it less than 80 columns wide --HG-- extra : convert_revision : f358795ca51531d7d5df0b62664085a8460cecc0 2007-09-27 Patrick Caulfield Recalculate quorum based on the expected votes value of a new node. bz#308581 2007-09-25 Andrew Beekhof Medium: Logging: Sanitize and centrally define the log facility used by various subsystems In changeset 1f454f857ee8 Alan changed the CRM to default to LOG_DAEMON. Most other places also default to LOG_DAEMON, with notable exceptions being ha_logd and cts. So... make ha_logd and cts default to use LOG_DAEMON and define the value to use centrally in configure.in As a bonus, remove calls to hb->llc_ops->get_logfacility() which is redundant now that cl_inherit_logging_environment() exists. --HG-- extra : convert_revision : 08888fc12148ae0e73db4e7ece809e416a55f429 2007-09-25 Robert Peterson Resolves: bz 304001: GFS2: Filesystems with 1k block size won't mount 2007-09-24 Fabio M. Di Nitto switch permanently to perl -w fix all the warnings in the script. Thanks to Patrick for spotting an extra one. Add support to allow disabling the build/install targets for each specific subsystem in the tree. Major clean up of top level Makefile thanks to Joel Becker input/suggestions. 2007-09-21 Ryan McCabe fix bz277781 by accepting "nodename" as a synonym for "node" 2007-09-21 Fabio M. Di Nitto Fix uninstall target 2007-09-19 Fabio M.
Di Nitto white space cleanup 2007-09-19 Ryan O'Hara Add ability to format output and filter based on node name. 2007-09-19 Patrick Caulfield Don't use _logsys functions as I get my wrist slapped. 2007-09-19 Fabio M. Di Nitto Fix configure and Makefiles to cope with kernel built with O=/path... Original patch by Joel Becker (joel.becker at oracle.com) NOTE for developers: you will need to re-run ./configure to update make/defines.mk NOTE for packages: you might need to change the way ./configure is invoked to cope with kernel_build vs kernel_src 2007-09-19 Patrick Caulfield Fix type-punned pointer warnings 2007-09-19 Fabio M. Di Nitto Fix more warnings when building with -O2 and also fix get_rmtabd_loglevel to actually do what it is supposed to. 2007-09-19 Andrew Beekhof High: RA: Remove bashism from Filesystem OCF agent --HG-- extra : convert_revision : 7dc7a64e0180018d856b84789af207c97c54b3be 2007-09-18 Lon Hohberger Fix #258141 - possible use after free in fenced 2007-09-18 Robert Peterson Resolves: bz 247318: Need man page for gfs2_edit Resolves: bz 291451: gfs2_fsck -n, Bad file descriptor on line 63 of file buf.c (addendum) 2007-09-18 Abhijith Das man page changes for new gfs2_quota reset option 2007-09-18 Patrick Caulfield check quorum device name length against the right size. 2007-09-18 Andrew Beekhof Low: RA: Add a tomcat OCF agent from Yamauchi Hideo --HG-- extra : convert_revision : d9f83224a8e819174214fda9bbdf0a9c431d53f1 2007-09-17 Robert Peterson Resolves: bz 291451: gfs2_fsck -n, Bad file descriptor on line 63 of file buf.c 2007-09-17 Patrick Caulfield Use openais logsys functions. 2007-09-16 Robert Peterson Resolves: bz 287901: GFS2: fsck errors and corruption with files > 945MB The gfs2_fsck program wasn't following enough levels of indirection when walking metadata.
2007-09-14 David Teigland go back to default of -O2 now that -Werror problems are fixed 2007-09-13 Andrew Beekhof Medium: RA: Treat migrating (status 1) as running to avoid - patch by Per Andreas Buer Monitor calls fail if monitor is called during migrations which can lead to resource running on two nodes. To trigger: set up HB2 to use live migrations and flip one of the nodes into and out of standby mode. --HG-- extra : convert_revision : 7388d76b0aabac5624c84c894b7455bd99fa21dd 2007-09-11 Patrick Caulfield Fix compile with -O2 -Werror Make it compile with -O2, by fixing a very dodgy cast. Allow it to build with -O2 2007-09-07 David Teigland Do nodedown events when the confchg for the groupd cpg arrives, instead of when the per-group cpg confchg's arrive. This means all nodes should have agreed ordering on the sequence of confchg's and messages, since all messages go through the groupd cpg. This should fix bz 258121 but I can't reproduce anything like that bug to verify. 2007-09-07 Fabio M. Di Nitto So in this first patch (that seems the most urgent one): - Make prefix default to /usr - Clean up all prefix use around configure (this will make alternate prefixes like /usr/local work properly). - Add a specific --aisexecbin option that is passed to cman_tool build. This change defaults to /usr/sbin/aisexec (default aisexec install path) but also allows local override if you have aisexec installed in different paths. 3 NOTES: - the cman_tool change has not been tested in production. It builds and shows that the path is passed properly. - all people that use a prefix=/ will need to make sure to use some extra configure options to respect FHS (for example to install man pages in /usr/share/man rather than /share/man..). - If this patch goes in CVS, you need to make sure to re-run ./configure.
2007-09-07 Patrick Caulfield Correctly reduce quorum when a node leaves using "cman_tool leave remove" bz#271701 2007-09-06 David Teigland forgot the 0 after the -O go back to a default of -O0 instead of -O2 to get the stuff with -Werror to build 2007-09-04 David Teigland report that a mount fails due to an in-progress unmount Reject mount attempts on an fs that's still in the process of unmounting. This regressed 8 months ago due to the bz 218560 changes. 2007-09-04 Lars Marowsky-Bree RA: Filesystem: VARLIB -> RSCTMP --HG-- extra : convert_revision : fcb4d5a2abf39c816ef77a886cb3cbadfd732388 RA: Filesystem: Fix path reference. --HG-- extra : convert_revision : 785267b7a27c6a75c82bb06410168209a5e37143 Move INITDIR to proper include. --HG-- extra : convert_revision : 538a0f72d8383cf7c6e759173cc98f05190f0fa7 RA: o2cb: Fix path broken in recent "cleanups". --HG-- extra : convert_revision : ba36f65470ddebf7096c6ab5e276b249fb267807 2007-08-31 Dejan Muhamedagic LRM test: include parts of lrm regression testing in the BasicSanityCheck --HG-- extra : convert_revision : e9cca735b5d13b1930b115e3f064b698b6af23cc 2007-08-30 Ryan McCabe listen() is not supported on SOCK_DGRAM 2007-08-30 Fabio M. Di Nitto Collect common make targets for fence/agents written in python Collect all common make targets for fence/agents written in perl Remove old code. ACK by Lon Fix build warning 2007-08-30 Ryan O'Hara BZ 249781 - Fix ccs_tool to return EXIT_SUCCESS for most commands. 2007-08-30 Lon Hohberger Fix #229650, pass 3 2007-08-30 Jonathan Brassow file dm.h-copy was initially added on branch RHEL5. file dm-log.h-copy was initially added on branch RHEL5. 2007-08-30 Fabio M. Di Nitto Cleanup clumon/ as agreed on cluster-dev 2007-08-29 David Teigland I think I added this years ago, forget why 2007-08-29 Fabio M. Di Nitto This is the first patch of a long series to collect common Makefile targets into their own snippets. Collect all passthrough operations into make/passthrough.mk. 
Update all passthrough Makefiles to use the new snippet. Cleanup group/test/Makefile Cleanup gfs/tests/ Makefiles Clean up cman/tests/Makefile 2007-08-28 Ryan McCabe Fix a handful of possible NULL pointer derefs 2007-08-28 Patrick Caulfield Fix spelling of DAEMON, sigh Add a 'cman_tool debug' command that allows cman debugging levels to be changed on-the-fly 2007-08-28 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : 98d06476798174cab1793807285687eeb4417c5e 2007-08-28 Fabio M. Di Nitto Remove obsoleted Makefile change the default CFLAGS to "-Wall -O2 -g". add --debug option to configure that will override the default CFLAGS to "-Wall -O0 -DDEBUG -g". clean up all the relevant Makefiles. add a few missing ; to configure script. -Wall is added by default in CFLAGS via configure to make/defines.mk. Remove all the others redundant definitions. 2007-08-24 Fabio M. Di Nitto Add dlm/tests/Makefile clean up dlm/tests/usertest/Makefile 2007-08-24 Dejan Muhamedagic RA apache: drop annoying messages if no configuration files are found --HG-- extra : convert_revision : 3a66bd7eb780482630a07ed6600fb29fdb48120d 2007-08-24 Alan Robertson hg: merged from 'dev' upstream --HG-- extra : convert_revision : 65d975ed110b0d38721e2521a301663bfadb8a88 2007-08-24 Abhijith Das fix for bz253016: userland fixes for gfs2 quota linked list 2007-08-24 Alan Robertson Changed the specfile so that it uses only the correct name for python/gtk package according to the distro --HG-- extra : convert_revision : 1692e3bcbc0fd76cb72a01a34ebe5afcbd3c258e LF bug # 1692 - monitoring fails for apache OCF resource. Put in simple fix to apache resource agent monitoring from Sebastian Reitenbach --HG-- extra : convert_revision : a245a85222cdddfbc407c62076eb1bac5bbe5af4 2007-08-23 Jonathan Brassow file queues.h was initially added on branch RHEL5. file queues.c was initially added on branch RHEL5. file logging.h was initially added on branch RHEL5. 
file logging.c was initially added on branch RHEL5. file local.h was initially added on branch RHEL5. file local.c was initially added on branch RHEL5. file list.h was initially added on branch RHEL5. file link_mon.h was initially added on branch RHEL5. file link_mon.c was initially added on branch RHEL5. file functions.h was initially added on branch RHEL5. file functions.c was initially added on branch RHEL5. file common.h was initially added on branch RHEL5. file cluster.h was initially added on branch RHEL5. file cluster.c was initially added on branch RHEL5. file clogd.c was initially added on branch RHEL5. file cmirror was initially added on branch RHEL5. file Makefile was initially added on branch RHEL5. 2007-08-23 David Teigland dstress fixes needed to be a little more thorough in taking a canceled transaction out of the dependency graph for cases where multiple locks were blocked between the same two transactions 2007-08-23 Alan Robertson Merged in a patch with one from Dejan to make RPMs build a little better Net effect: hopefully an improvement. --HG-- extra : convert_revision : 4d0262e6f6cf7158068796a9afe02e3bf2297370 2007-08-23 David Teigland rewording and embellishing some bits related to openais.conf 2007-08-23 Alan Robertson Put in a number of small fixes to make fedora packages make correctly... --HG-- extra : convert_revision : 9581cbd806086dcd016d4dfe2de524128c508ee1 2007-08-23 Patrick Caulfield Mention the openais.conf parameters that cman overrides. 2007-08-22 Fabio M. Di Nitto Remove fence/agents/xen from CVS HEAD. ACK on cluster-devel and IRC #linux-cluster Remove cs-deploy-tool from CVS HEAD. Last commit on this tool was from 2005 and code is available in all other branches. ACK on cluster-devel and IRC #linux-cluster Remove ddraid from CVS HEAD. 
ACK on cluster-devel and IRC #linux-cluster 2007-08-22 Patrick Caulfield Add some info about openais.conf parameters Add some information about setting up multi-home (redundant ring) 2007-08-22 David Teigland fix attribute xml format for cluster_id and keyfile 2007-08-22 Fabio M. Di Nitto Fix build with gcc-4.2 2007-08-22 David Teigland add new test for deadlocks use an admin handle from cman to call set_dirty mention group_tool should be used instead of cman_tool services 2007-08-22 Patrick Caulfield Update man page 2007-08-22 Fabio M. Di Nitto Cleanup FOO_RELEASE_NAME to RELEASE_VERSION 2007-08-22 Patrick Caulfield Document that cman_set_dirty() needs an admin socket. Clear error flag for SET_DIRTY 2007-08-21 David Teigland comment out the new cman_set_dirty() call; it's not working 2007-08-21 Ryan McCabe Fix access beyond allocated memory 2007-08-21 Fabio M. Di Nitto Allow the full cluster suite to build using external kernel source. Also remove the use of -idirafter that with some old versions of gcc does not behave as we expect. 2007-08-20 Abhijith Das fix for bz253172 - gfs2 init script should not unload any kernel modules 2007-08-20 David Teigland proper help output for -m option the -m mode option was being ignored and 0600 always used (this change must have been lost at the same time as the -d option) the NODIR new_lockspace flag was always being used, even if the -d was used to deselect it update ccs man pages 2007-08-20 Dejan Muhamedagic configure.in: fix for platforms with libc in /lib/tls The changeset 5f8f217f306a introduced looking for a "proper library suffix" probably in order to deal better with 64-bit platforms (it's the one in /usr/lib/heartbeat for example). However, it fails in case libc is linked from /lib/tls/libc.so, producing "tls" as a library suffix, which is obviously wrong. 
--HG-- extra : convert_revision : 93d093a018516fd802435c2d4b17f340a1c570d9 2007-08-20 David Teigland Call the new cman_set_dirty() api to disallow clusters both with fence/dlm/gfs state from merging. Adjust the oom setting for the daemon to avoid oom kills. 2007-08-20 Patrick Caulfield Add a "dirty" flag to cman to prevent active clusters merging with one-another. bz#251966 2007-08-20 Fabio M. Di Nitto Remove redundant gfs_ondisk.h from gfs/include/ and gfs2/include/ both files are outdated compared to the one shipped in gfs-kernel/src/gfs/gfs_ondisk.h and there should be no need of duplicates around the tree. gfs and gfs2 both used to include gfskincdir that is enough to guarantee that gfs_ondisk.h will be available at build time. gfs2/mount/Makefile requires this one liner. I spotted that we try to include gfs2kincdir that we never set and therefore it was failing to build without a local copy of gfs_ondisk.h add clean: target or make clean will fail. use TARGET8 to be consistent with TARGET3 2007-08-17 David Teigland handle addition/removal/failure of nodes during a deadlock cycle serialize deadlock cycles and limit how often cycles are started install in man dir add makefile install dlm_tool.8 Vastly simplify this man page. Include no cman or fencing information but refer to the cman and fenced pages. Outline the basic ideas of multiple methods and multiple devices. minor updates add man page 2007-08-17 Alan Robertson hg: Merged changes in from upstream + merged in some changes in configure.in. Everything else was unchanged by me.
--HG-- extra : convert_revision : b16bbd412104a8e1f32dc1b736425ac8462cfb6a 2007-08-16 David Teigland mention fencing override, describe the structure of node fencing parameters in cluster.conf, point to web site for device-specific documentation add man pages 2007-08-16 Andrew Beekhof Low: RA: ManageVE - Fix status when VE files are not persistent Don't abort with OCF_ERR_INSTALLED in status if '$VZCTL status' fails --HG-- extra : convert_revision : 543759e9bac2ba345d8a81e7140d151c94eb5640 Hg: Merge with upstream --HG-- extra : convert_revision : 1183b0025c55f2657a286f5965b04d754e87f010 Medium: RA: EvmsSCC - Handle start failures caused by peers starting at the same time (Patch from Jo De Baer) --HG-- extra : convert_revision : 99286c892d4499df57fdc22528fdde6efc3ea2a0 2007-08-16 Robert Peterson Resolves: bz #240545: gfs2_fsck should behave more like the other fscks. 2007-08-15 David Teigland Update fence, fenced, fence_tool and fence_node man pages which were stuck in the RHEL4 era. clean out some options that were only relevant to rhel4 remove the monitor option which didn't do anything add the dump option to dump the fenced debug buffer (group_tool can still do this, but fence_tool wasn't oddly enough) clean out junk that was only relevant to rhel4 2007-08-15 Ryan McCabe Fix a few (harmless) places where memory is allocated but not freed that I stumbled onto while hunting down something else. 2007-08-15 Lon Hohberger Fix uninitialized var 2007-08-15 Simon Horman [STONITH] Disable ipmilan by default According to Sean Reifschneider the ipmilan stonith module seems to be in chronically bad health. This patch provides a configure option to enable compiling ipmilan, which is now disabled by default.
Cc: Sean Reifschneider --HG-- extra : convert_revision : 11b9a169980a512fad5f4799b0729aaa049fabe0 2007-08-14 Dejan Muhamedagic RA: eDir88: fix to allow meta-data to work in case NDS is not installed --HG-- extra : convert_revision : 671820e0903cc034c30f70a3bf02483e2c70c0b3 2007-08-13 Alan Robertson Put in updates to the Informix script as supplied by Lars Forseth --HG-- extra : convert_revision : 8d99007ef4d0c0cb989be843c856a8b7025e0356 2007-08-13 David Teigland put back the ability to do pid-based deadlock detection on 5.1 kernels 2007-08-10 David Teigland Detection and resolution now works with my basic deadlock tests. Had to rework how lock state is assembled. The previous method was simpler and just gathered the master lock state from all nodes, but I failed to realize that the xid (transaction id) isn't synced to remote master copy locks. So, now all lock state is saved in the checkpoints, both master-copy and process-copy (containing the xid) which are then merged to give the full view of the lock. 2007-08-10 Fabio M. Di Nitto group/ now depends on libdlm. Express this new dependency in top level Makefile to guarantee that group will be built only after dlm.
2007-08-09 Sebastian Reitenbach [RA] patch to fix bashism in resources/OCF/pgsql works well on OpenBSD See: Bugzilla #1670 Cc: Sebastian Reitenbach --HG-- extra : convert_revision : e9b44ef9b9695b6c5bd4397f2f5ff3702f819b19 [RA] patch to fix bashism in resources/OCF/o2cb See: Bugzilla #1670 Cc: Sebastian Reitenbach --HG-- extra : convert_revision : 82ba5288f1f4ff0acea957eed4724eeae9ee02a3 [RA] patch to fix bashism in resources/OCF/eDir See: Bugzilla #1670 Cc: Sebastian Reitenbach --HG-- extra : convert_revision : 3e811c10ec665047918f803c22eae87f45a5aa63 [RA] patch to fix bashism in resources/OCF/SysInfo should work, tested at the shell, on OpenBSD, not managed via heartbeat See: Bugzilla #1670 Cc: Sebastian Reitenbach --HG-- extra : convert_revision : 8a78ba8c96a36c240ceac1ca75d83c3b97f1b6c4 [RA] patch to fix bashism in resources/OCF/Stateful See: Bugzilla #1670 Cc: Sebastian Reitenbach --HG-- extra : convert_revision : 726997c2195fbc29cdb6f42f3b57caad2cf3c550 2007-08-09 Fabio M. Di Nitto Clean up some Makefiles that did not use proper openaisincdir and dlmincdir. Fix build on parisc as we did for ia64 2007-08-07 David Teigland don't add the same transaction to a waitfor list more than once 2007-08-07 Dejan Muhamedagic tools: ocf-tester: add capability to test with lrmadmin/lrmd --HG-- extra : convert_revision : 72d24d65fa250eba3ec67fb5d42c307899744649 2007-08-06 David Teigland fill in a couple more bits related to canceling the chosen lock Remove check_sys_fs() since it breaks on-demand fs module loading from the kernel (already changed on RHEL5 branch). Use strerror() instead of errno in another spot to be more user friendly. 2007-08-03 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : 3de4aa3a9fd7237b5e6dd33dd13aa1f570c9e763 2007-08-03 Patrick Caulfield Don't lose the cluster name if it is specified on the command line probably the cause of bz#250688 2007-08-03 Xinwei Hu [RA] eDir88 resource can't be stopped ?
eDir88 reports "Invalid argument stop" when being stopped. --HG-- extra : convert_revision : ac7faf1e9d89393076f4217ac019ce435a20a40a 2007-08-02 Lon Hohberger Fix #248727, round 2 2007-08-02 Alan Robertson Upgraded the RPM revision level from 1 to 2. --HG-- extra : convert_revision : 040c4f11efe38b8af20d10a1ce0d6774371f6ea4 More updates to the spec file. --HG-- extra : convert_revision : 77167e8d0f7885fb72d11d734aaf58db6408cd21 2007-08-01 Andrew Beekhof RA: apache - make status quieter --HG-- extra : convert_revision : 8a85b944ea4e7c32ffc584aa5081a4b0cc741bd2 2007-08-01 Alan Robertson Merged in my ids fix with the ids fix from Andrew. --HG-- extra : convert_revision : 5f563158558cf326ea4960f4442514fc78dda252 branch merge on configure.in --HG-- extra : convert_revision : a9ed93de6f37a4949b23d4663fe1155c4447f556 2007-07-31 Alan Robertson Made the version number really 2.1.3 to keep from confusing RPM --HG-- extra : convert_revision : d45731dc1acf39b026db3c95cc80bcf46f9c3a99 Incremented version number - to avoid confusion. 
--HG-- extra : convert_revision : 859a80bb59c71e486ed69ae01ecf8a6e5786dd01 LF bug # 1662 - massive heartbeat specfile update - to make it more usable --HG-- extra : convert_revision : 5f8f217f306a74404e2ed1f9ea771f6dd2a5b3fe 2007-07-31 Lon Hohberger Fix build problem Fix bug #248727 2007-07-31 Andrew Beekhof RA: IPaddr2 - Make the check for the modprobe/iptables utilities conditional on the IP being cloned --HG-- extra : convert_revision : 4e5babc05ae9f91f03017ffbb533751e8a865513 Build: Alan forgot to generate resources/heartbeat/ids from configure which prevented building --HG-- extra : convert_revision : 49cca21749d05ee13ba641f9f877792d3b187296 Hg: Merge with upstream --HG-- extra : convert_revision : 6bec5949cca448aa3054de67a207960f48b62645 2007-07-30 Lon Hohberger Fix #250152 2007-07-30 Alan Robertson hg: branch merge --HG-- extra : convert_revision : 3831115381136a0412d44a6ec34eac01c3d4b3ef Added Informix resource agent --HG-- extra : convert_revision : fdc97f626a31b04d7806ded020dad981d2060ac8 2007-07-30 Robert Peterson Added ability to parse and print journal information. For example: gfs2_edit -p journal2 /dev/roth_vg/roth_lv Also added the ability to jump relative block numbers. This is for when you jump to a block by "editing" the block number: Cursor up to block number, press to enter destination block. Before you could type a structure or block number. Now you can type in a structure, block number, or relative block number. For example, if you're on the superblock (block 0x10) enter +0x300 will take you to block 0x310. This number may be in decimal or hex and may be positive or negative. 
2007-07-30 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : e69440f25b4c3d3d44f381e1d6bdbd9cb4262aab 2007-07-29 Alan Robertson hg: upstream merge --HG-- extra : convert_revision : f5169f3919a657e57e0a1ab788c8404f8d721a96 Removed 2.1.1.1 tag, changed release name to 2.1.2 --HG-- extra : convert_revision : 462b4034903770a0dad34ca1319a5b8c02f4e35e 2007-07-28 Alan Robertson Merge from upstream. --HG-- extra : convert_revision : 5944c3fc82a723a79eb66cc8c5ff3085cd2dc999 Patch to revert configure.in check from Tuomo Soini - LF bug 1660 --HG-- extra : convert_revision : 3ad95813d17f3d1adf0c38713ee09dca5e45cd5d 2007-07-28 Simon Horman [BUILD] Resolve datarootdir in configure When using Debian's autoconf 2.61-4 early on in configure the following definitions occur: datarootdir='${prefix}/share' datadir='${datarootdir}' It seems that unless datarootdir is evaluated, datadir can't be evaluated correctly, though I must confess that I don't really understand why. In any case, without this patch, I see ${prefix} creeping into output files, like hbclient.py, but with this patch I see no such problem. Without this patch I also see the following tell-tale warnings from configure. Our Host OS: linux-gnu/i686-pc-linux-gnu configure: WARNING: datadir directory (${prefix}/share) does not exist! configure: WARNING: docdir directory (${prefix}/share/doc/heartbeat-2.1.1.1) does not exist! --HG-- extra : convert_revision : ee1ab9ae39aa9a7c9b2c1375347adae055596adb 2007-07-27 Alan Robertson Updated version to 2.1.1.1 --HG-- extra : convert_revision : 103543c02dbe65d5bead4837c63d335573bde8da Fixed a fairly-major error in IPaddr - it didn't handle CIDR netmasks correctly. But ifconfig didn't complain either... --HG-- extra : convert_revision : 21b18742065803162b95c00ca0af1fd0f292d6e1 2007-07-27 Robert Peterson Resolves: bug #248423: gfs2_tool can not set data journal flags as specified in the man page.
2007-07-26 Lon Hohberger Fix #249758 2007-07-24 David Teigland sdake says that DESTDIR=/ is correct, not /usr minor updates for cluster-2.01.00 dlm_tool deadlock_check is a way to manually kick off a deadlock detection cycle for the named lockspace add new code to find and resolve deadlocks, still incomplete, disabled by default 2007-07-23 Lon Hohberger Misc. bugfixes; see ChangeLog 2007-07-23 David Teigland Brute-force porting to 2.6.23-rc1. There are non-trivial changes for which I just copied what had been done to gfs2 without investigating whether gfs1 needs something different. 2007-07-23 Alan Robertson hg: branch merge --HG-- extra : convert_revision : 66cb5b0c36d55b65ec39802ac8dabefe929cd12e 2007-07-21 Dejan Muhamedagic apache RA (LF 1656): include ServerRoot in relative paths. --HG-- extra : convert_revision : ae378b4fc3f5147935796e0e29f866eed2fe16f8 2007-07-20 Robert Peterson Resolves: bz #247591: Make default journal size for gfs2 128M 2007-07-20 Abhijith Das Fix for bz248177: We delete the old /etc/mtab entry and add a new one during remount. Any changes made to the mount options using remount are reflected in /etc/mtab now. 2007-07-19 Alan Robertson hg: merge from 'dev' tree --HG-- extra : convert_revision : 5491b90f3ce56b29399d29687d73f41e7de6f6e5 LF Bug 1576: shutdown hangs under certain shells // LF Bug 1534: compiling/installation error on OpenBSD --HG-- extra : convert_revision : f7e3d74d0499c30a86affdc532c3ffedd47c294f Merged in changes from the 'dev' branch. 
--HG-- extra : convert_revision : 9c3523b5ecb3e2795ffb20d076bf59dd82aae8dd LF bug # 1650 - heartbeat should put scripts/non arch specific things in /usr/share --HG-- extra : convert_revision : 507cfb8c64250e56d18b29d77a1c885430d525b3 2007-07-18 David Teigland recent cleanup of warnings should have specified unsigned in long long casts 2007-07-17 Andrew Beekhof OCF: Provide the location of /sbin as used by some agents (HA_SBIN_DIR) --HG-- extra : convert_revision : 4d4a65a685a4750cc72a8ce19dbe684f0c9f7416 2007-07-17 Robert Peterson I added the ability to recurse indirect blocks. That means that you can now print a list of all blocks associated with a file. For example, you could print all block numbers associated with journal0 by doing something like: gfs2_edit -p 0x19 /dev/roth_vg/roth_lv assuming, of course, that you know that block 0x19 contains journal0. (You can use gfs2_edit -p jindex to get that information though). I fixed some bugs with restoremeta where it was dying if the metadata to be restored was bigger than the destination file system could hold. I made restoremeta do some "warm fuzzy stuff" to report its progress of the restore operation so the user doesn't think the restoremeta is permanently hung. I renamed some long variable names to short ones to make the code less wordy. I fixed a minor segfault if you hit "j" when your cursor is on the block number of the hex view. I probably should have committed these changes earlier rather than save them up, but some of my changes had serious regressions and I didn't have time to sort it out and fix it until just now (I didn't want to commit a broken version to cvs.) 2007-07-16 Lars Marowsky-Bree RA: MailTo: Remove spurious warning about monitor. --HG-- extra : convert_revision : fb46aeb2b784b0093f33afc44f49579951704085 2007-07-13 David Teigland add lockdump and option to set permission of dlm device when creating Make gfs-kernel compile against post-2.6.22 (2.6.23-rc) kernels. 
(No more sendfile which is now done via splice which gfs1 still lacks.) 2007-07-12 Alan Robertson Merged two heads from the 'dev' repository. --HG-- extra : convert_revision : d1d066d319e87bff8ebef5bcafb0d33251bd2cc8 2007-07-12 Matthew Soffen Corrected so that on *BSD it passes in 2 parameters, instead of 2. --HG-- extra : convert_revision : 76e8e77d397c38bdf61af4be103f2985875a344d 2007-07-12 Ryan McCabe Detect bridged networking configurations where additional parameters are supplied to the script. 2007-07-12 Ryan O'Hara Fix bug where mkfs always exits with EXIT_FAILURE. 2007-07-12 Lars Marowsky-Bree Merge local changes with dev. --HG-- extra : convert_revision : 29eebecbb33021f0d3e1ff93a685ef41d8af8151 RA: Dummy: Just add a comment to make sure people copying from this don't get monitor wrong. --HG-- extra : convert_revision : 2abd83d059db0e914680c324f1ef8271badffdb6 2007-07-12 David Lee import /bin/sh fix from David Lee --HG-- extra : convert_revision : fb847da0319d8fe64b6cad87c213d6c9682520a0 OCF: minor Bash/Bourne issue --HG-- extra : convert_revision : d69db76e485d398f4ebba86383c384cfdcf689bb 2007-07-12 Marek 'marx' Grac Resolves: #245178 - install RA for named (agent already in CVS) 2007-07-12 James Parsons Fix for bz238106, new firmware version issues 2007-07-11 Alan Robertson LF bug # 1534 - compiling/installation error on OpenBSD http://old.linux-foundation.org/developer_bugzilla/show_bug.cgi?id=1534 This change cleans something up pointed out by the OpenBSD folks. Although I don't think it fixes his complaint. 
--HG-- extra : convert_revision : 03aceaa4e1eaa4da01e8c4dd6e67ca76a1510ec8 Upstream merge --HG-- extra : convert_revision : 81e4f2a678c2b3f6ffb1dfa7b56f2c5a338dcc1e 2007-07-11 David Teigland add a bunch of casts to quiet warnings on x86-64 print a couple decimal places for times in the debug logging 2007-07-11 Alan Robertson Patch applies to bug LF # 1534 - compiling/installation error on OpenBSD http://old.linux-foundation.org/developer_bugzilla/show_bug.cgi?id=1534 --HG-- extra : convert_revision : 9c9b0f9aef5c2765d2e8c6c2fc0c38b3a179fa31 2007-07-11 Andrew Beekhof Fix location of pingd binary --HG-- extra : convert_revision : 07bec0aefb459f7c6cef0747d113e6af465dffb9 2007-07-11 Patrick Caulfield file dlm_query_wait.3 was initially added on branch RHEL4. file dlm_query.3 was initially added on branch RHEL4. file dlm_ls_query_wait.3 was initially added on branch RHEL4. file dlm_ls_query.3 was initially added on branch RHEL4. 2007-07-10 Lars Marowsky-Bree RA: Filesystem: Unify coding style to match rest of script. --HG-- extra : convert_revision : e0e21ed28c361a3c00e0a77a92d83f99f41f6090 2007-07-10 Alan Robertson Merge from upstream. --HG-- extra : convert_revision : e33fb9d0a07599c59a4796d8ef6f6e26aec843c0 A little cleanup to make the RAs more uniformly not have $OCF_ROOT paths in them --HG-- extra : convert_revision : e5e3dfdd2f32ef21c67aefdd436013646dd7b0d2 Fixed a syntax error in Filesystem: missing then symbol --HG-- extra : convert_revision : 499a3907e4ccf848eea94d5506740ee125f59f83 2007-07-10 Lon Hohberger Resolves: 247488 2007-07-10 Ryan O'Hara BZ 240584 - Check to see if device is mounted before creating filesystem. 2007-07-10 Robert Peterson Resolves: bz 247591: Make default journal size for gfs2 128M 2007-07-10 Lars Marowsky-Bree OCF: The compatibility wrappers used to be located in libhbdir, not in the (new) noarchlibhbdir. 
--HG-- extra : convert_revision : aec125907c9133bf42022ff2c879101b964d41eb 2007-07-10 Alan Robertson Removed a line I had to add earlier. Maybe we had a non-obvious merge conflict? --HG-- extra : convert_revision : 0008f2dd07b0d69f2ec70959e9f9f36eef0c8a6b Merged in upstream changes... --HG-- extra : convert_revision : aa7230f943d824e0ffd2275b97252ec3a314fb30 2007-07-10 Andrew Beekhof Build: cvsignore maintenance --HG-- extra : convert_revision : ca770e9dd6356fbe150793087fc7a44462d44580 2007-07-10 Lars Marowsky-Bree configure.in: Further fix NOARCHLIBHBDIR setting. Broken with http://hg.linux-ha.org/dev/rev/b9b4c709004b --HG-- extra : convert_revision : b9b074a9644b84c97454792c7811425814e69cf7 configure: Generate heartbeat/shellfuncs again. Fixes build broken by http://hg.linux-ha.org/dev/rev/b9b4c709004b --HG-- extra : convert_revision : aa4b34801cac2e171fa8cfab46dc8aa2ccc83f6e 2007-07-10 Alan Robertson Restored a missing file from configure.in. Must have been deleted by accident. --HG-- extra : convert_revision : 72081feacd2e494b5d694b8ad831537f8415904f LF Bug # 1617 - Miscellaneous RPM and source cleanups http://old.linux-foundation.org/developer_bugzilla/show_bug.cgi?id=1617 --HG-- extra : convert_revision : b9b4c709004b18bf08ac2c73e7589ea429b198d2 2007-07-09 David Lee OCF: Detect and warn of use of deprecated 'ocf-shellfuncs' and 'ocf-returncodes'. --HG-- extra : convert_revision : 063b699fa0327879b5ab115b575b6431d56dd37e 2007-07-09 David Teigland Various small changes and additions. Munging formatting to avoid line wrapping on 80 columns. 
2007-07-09 Patrick Caulfield remove redundant Makefile lines libdlm man pages 2007-07-09 Andreas Mock [LDIRECTORD] Add OCF wrapper OCF wrapper by Andreas Mock Cc: Andreas Mock --HG-- extra : convert_revision : b93d541ea14a498106064a7125cce7207644b9bd 2007-07-09 David Lee OCF definitions: correct minor typo in recent re-ordering --HG-- extra : convert_revision : cb51f7db12db1b7b9eead1b5b0ce8a004814c564 2007-07-06 David Lee OCF build: 'export PATH=...' is Bash; convert to Bourne --HG-- extra : convert_revision : f19a93e62a745aa2ed2cdb0314029d9dd8929df3 OCF: Minor builddir v. srcdir adjustment after recent tidy-up --HG-- extra : convert_revision : bf76990bd02f938a88a24a4c80569c2123046b3e 2007-07-06 Andrew Beekhof Build: Remove generated file from EXTRA_DIST --HG-- extra : convert_revision : 79508d954709ca00cee8e2f1ad710b042107e2a4 Build: Fix installation of .ocf-returncodes to its legacy location --HG-- extra : convert_revision : 1eb15905ada7345fddc4721db3eba869d9c6c904 Hg: Merge with upstream --HG-- rename : heartbeat/IPaddr.in => heartbeat/IPaddr rename : heartbeat/o2cb.in => heartbeat/o2cb extra : convert_revision : cbea859544873a670dbdb80ce42b0a2a8f1dd0c0 2007-07-06 Lars Marowsky-Bree RA: IPaddr: Fix a few trivial bugs in lvs_support pointed out by Michael Stiller. --HG-- extra : convert_revision : 8c6a03149a3af3c4fe0c8eebb2cc37a6b2a1a7fe 2007-07-06 Andrew Beekhof RA: IPaddr2 - handle IP_CIP not being defined --HG-- extra : convert_revision : 1a4594cf8784169e22e8ee346d1b1ea1f9fb9936 RA: IPaddr2 - Remove bash-isms --HG-- extra : convert_revision : bef30e5a2a31d81931eaa75ab2eb7620275abd83 RA: Bug #1630 IPaddr2 - Wrong arguments passed to send_arp (test was mistakenly written as an assignment) --HG-- extra : convert_revision : 49d4ef764f74076304c2292484242cefd42bce9c OCF: Fixed the check in have_binary. 
Handle checks for programs with --arguments Also: * Better reuse * Sane logging if ocf_log is not available --HG-- extra : convert_revision : e71a4a80a2bba919699f9e47210c65e1fa4162d8 2007-07-05 Andrew Beekhof RA: Convert a number of checks for required binaries --HG-- extra : convert_revision : e2ef2b0879ddcbf726b6998a59aad86ad67f31b4 configure: Remove unneeded AC_PATH_PROGS checks --HG-- extra : convert_revision : 4ef4979f1f8e4030ca01d686dcc5aee27795baf2 OCF: Remove unused variables --HG-- extra : convert_revision : a43f145e27a779a59260c194d227a1da8400f01b RA: Use consistent check for required binaries --HG-- extra : convert_revision : f891f3723ead3f9888692187a3bb6833823407af RA: prefer the 'test' program rather than the builtin for some reason --HG-- extra : convert_revision : 93ddbc2e780ce4caa02229162e5e41f5eddc8f49 RA: ServeRAID - remove commented out code --HG-- extra : convert_revision : 3ad928d0e4fee35d5434c8183fa232f0fafe1866 OCF: Remove conversation from installed script --HG-- extra : convert_revision : fe1504c8117faad1674339c7580e79144450d8c6 RA: Move some helper files to their new names (.files dont show up as RAs) --HG-- extra : convert_revision : 433ddba35e72e660102cc88b176eb01ea5089a58 Build: No longer generate RAs now that there is a sane way to set directory locations and find binaries --HG-- rename : heartbeat/AudibleAlarm.in => heartbeat/AudibleAlarm rename : heartbeat/ClusterMon.in => heartbeat/ClusterMon rename : heartbeat/Delay.in => heartbeat/Delay rename : heartbeat/Dummy.in => heartbeat/Dummy rename : heartbeat/EvmsSCC.in => heartbeat/EvmsSCC rename : heartbeat/Evmsd.in => heartbeat/Evmsd rename : heartbeat/Filesystem.in => heartbeat/Filesystem rename : heartbeat/ICP.in => heartbeat/ICP rename : heartbeat/IPaddr.in => heartbeat/IPaddr rename : heartbeat/IPaddr2.in => heartbeat/IPaddr2 rename : heartbeat/IPsrcaddr.in => heartbeat/IPsrcaddr rename : heartbeat/LVM.in => heartbeat/LVM rename : heartbeat/LinuxSCSI.in => heartbeat/LinuxSCSI 
rename : heartbeat/MailTo.in => heartbeat/MailTo rename : heartbeat/ManageRAID.in => heartbeat/ManageRAID rename : heartbeat/ManageVE.in => heartbeat/ManageVE rename : heartbeat/Pure-FTPd.in => heartbeat/Pure-FTPd rename : heartbeat/Raid1.in => heartbeat/Raid1 rename : heartbeat/SAPDatabase.in => heartbeat/SAPDatabase rename : heartbeat/SAPInstance.in => heartbeat/SAPInstance rename : heartbeat/SendArp.in => heartbeat/SendArp rename : heartbeat/ServeRAID.in => heartbeat/ServeRAID rename : heartbeat/Stateful.in => heartbeat/Stateful rename : heartbeat/SysInfo.in => heartbeat/SysInfo rename : heartbeat/VIPArip.in => heartbeat/VIPArip rename : heartbeat/WAS.in => heartbeat/WAS rename : heartbeat/WAS6.in => heartbeat/WAS6 rename : heartbeat/WinPopup.in => heartbeat/WinPopup rename : heartbeat/Xen.in => heartbeat/Xen rename : heartbeat/Xinetd.in => heartbeat/Xinetd rename : heartbeat/apache.in => heartbeat/apache rename : heartbeat/db2.in => heartbeat/db2 rename : heartbeat/drbd.in => heartbeat/drbd rename : heartbeat/eDir88.in => heartbeat/eDir88 rename : heartbeat/mysql.in => heartbeat/mysql rename : heartbeat/o2cb.in => heartbeat/o2cb rename : heartbeat/oracle.in => heartbeat/oracle rename : heartbeat/oralsnr.in => heartbeat/oralsnr rename : heartbeat/pgsql.in => heartbeat/pgsql rename : heartbeat/pingd.in => heartbeat/pingd rename : heartbeat/portblock.in => heartbeat/portblock rename : heartbeat/rsyncd.in => heartbeat/rsyncd extra : convert_revision : 4e22c4d5495a902fe14bc2bab8afd114f4efb58e RA: Add some extra variables dug out from configure --HG-- extra : convert_revision : e52ff63082ac4e2689485c446666f2540a082340 RA: Move all remaining autoconf variables into common files that are automatically included --HG-- extra : convert_revision : 43b039df431b7d03a32e8cdf4f8c5e7d1c41347e 2007-07-05 Fabio M. Di Nitto Overload Makefile to give Lon a build target and keep the style consistent across. 
Most important change (really) is to keep incdir as last or custom incdir build will break. 2007-07-05 Andrew Beekhof RA: Revised handling of directory locations OCF mandates that OCF_ROOT must be set before an RA is called Install our helper function into OCF_ROOT and use it to populate various directory locations used by RAs (instead of using autoconf). --HG-- extra : convert_revision : 5141b7d75fb2dae6f032b3d3ae2e73681fe3c799 2007-07-05 Patrick Caulfield Honour the mode parameter to dlm_create_lockspace() even if the device node was created by udev. 2007-07-02 Jonathan Brassow Require vg_name to be unique. Allowing multiple LVs from the same VG on different machines can lead to races when updating metadata during device failures. We can do better. This patch puts the validation in lvm.sh so that it can print out an understandable error message. 2007-07-02 Lon Hohberger Fix #237144 - pass 2. All testcases accounted for now. 2007-06-30 Lars Marowsky-Bree RA: o2cb: Improve configuration stability. --HG-- extra : convert_revision : 94d9a8c98f929affbd5a82bfbed188bb63fe71c6 RA: o2cb: Optimize redistribution of cluster.conf. Instead of the leader pushing out changes, the slaves now compare the md5sum from the leader (used as the magic lock value) to their own, and only retrieve when the cluster.conf has actually changed. This also allows for greater concurrency and might be an interesting snippet of code for other RAs as well.
--HG-- extra : convert_revision : 3031dda974e472ae5a2ea568246695bd4a6d7fe7 2007-06-29 Robert Peterson Resolves: bz 241096: GFS: bug in gfs truncate 2007-06-29 Lon Hohberger Add note to usage.txt for configuring on 64-bit environments 2007-06-29 Robert Peterson Resolves: bz 245803: GFS2: buffer count underflow for block 29581 (0x738d) 2007-06-29 Lars Marowsky-Bree RA: o2cb: Revamp distribution of cluster.conf --HG-- extra : convert_revision : bb63c03d5475494b5e407bd2854b8dd9ba468e91 2007-06-28 David Teigland - add more specific warnings/errors when connecting to gfs_controld fails - use strerror to report more helpful error messages in a few spots 2007-06-28 Lars Marowsky-Bree RA: o2cb: Initial version. --HG-- extra : convert_revision : 3ea74713a04de09eb0b4d8146be7f19c2a027689 2007-06-27 Lon Hohberger Merge from RHEL5 branch Remove testprog target. 2007-06-27 Lars Marowsky-Bree RA: Filesystem: search the path as well, if needed. --HG-- extra : convert_revision : 50c73d57f1f978a52d19af3f16ec0126614de74c 2007-06-26 Lon Hohberger Make lan+ work if built as a STONITH module 2007-06-26 Wendy Cheng Bugzilla 239729: The purge_nr in glock_scan is already a pointer. Fix error in today's check-in. 2007-06-26 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : 55132e3d3f3ce809e9e4f7f95081697b1c687e99 2007-06-26 Wendy Cheng Bugzilla 239729: Accidentally moved the wrong patch - fix previous check-in. RedHat bugzilla 239727: Previous CVS check-in did a last minute change with the way purge count was calculated. The intention was to trim glocks evenly across all the hash buckets and apparently the size of hash array was overlooked. It ends up with zero trimming count most of the time. This virtually makes glock trimming patch a void feature. 2007-06-26 Lars Marowsky-Bree Merge local changes with dev.
--HG-- extra : convert_revision : f03b8a4ca6f418d447512072d798926b1e32059d RA: Filesystem: Improve OCFS2 UUID retrieval and error handling --HG-- extra : convert_revision : 2b26a68cafd5f577833cbc012c9c4a0a6fde7245 2007-06-26 Lon Hohberger Clean up testprog in make clean Ancillary patch to fix debug output Fix full-virt rebooting (#243872); add local-only / no-cluster mode to fence_xvmd 2007-06-26 Andrew Beekhof RA: eDir - failed stop doesn't exit with error (Yan Fitterer) Fixed bug re-introduced by last broken patch (failed stop doesn't exit with error) Fixed many space/tab cosmetic errors Updated internal version to 0.17 --HG-- extra : convert_revision : 7444cd77690aaa8241586fda1b20f0c50f5147e2 2007-06-26 Patrick Caulfield Fix timer durations 2007-06-26 Robert Peterson Resolves: bz 245360: GFS2: userland tools have problems with small block sizes Add savemeta and restoremeta functions to gfs2_edit 2007-06-25 Ryan McCabe HP changed the iLO 2 interface again in the latest firmware revision, 1.30 (released on 2007-06-01) 2007-06-25 David Teigland s/unsigned long/unsigned long long/ 2007-06-25 Robert Peterson Fix a place where indirect offsets were calculated incorrectly. 2007-06-25 Ryan McCabe Rename "private" to "priv" to make the file usable by C++ programs, and wrap the header with extern C { ... } if compiling C++. 2007-06-25 Lon Hohberger Fix missing label 2007-06-22 Robert Peterson Resolves: bz 245360: GFS2: userland tools have problems with small block sizes 2007-06-22 Lon Hohberger Make exclusive resources work again 2007-06-22 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : 5459346c59fcc34d9515df5d5fb46f5bf0bce15e 2007-06-22 Robert Peterson Make gfs2_edit handle small different block sizes. 2007-06-21 Fabio M. Di Nitto Fix build on ia64 by adding a temporary workaround and make sure to wrap STACKSIZE properly within DEBUG.
2007-06-20 Simon Horman [PATCH] Use sys/types.h instead of asm/types.h in configure(.in) #include <sys/types.h> instead of <asm/types.h> in the configure check for icmpv6.h to make sure that __attribute_used__ is present, which is needed indirectly. This does seem to be correct to me, but there is always the chance of a regression, so feedback would be appreciated Observed with /usr/include/linux/icmpv6.h supplied by linux-libc-dev 2.6.21-2 on ia64 Debian (unstable) configure:30482: gcc -std=gnu99 -c -g -O2 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/libxml2 conftest.c >&5 In file included from /usr/include/asm/intrinsics.h:18, from /usr/include/asm/byteorder.h:10, from /usr/include/linux/icmpv6.h:4, from conftest.c:77: /usr/include/asm/gcc_intrin.h:26: error: expected ',' or ';' before '__attribute_used__' --HG-- extra : convert_revision : 3dea186a3b19cf5d358def5e454a3793aec0e977 2007-06-20 Benjamin Marzinski GNBD doesn't need to flush the cache after it loses connection with the server. Either it will be multipathed, and the multipath device will own the cache, or it won't and flushing the cache will get you nothing more than a flood of error messages all at once, instead of a stream. 2007-06-19 Wendy Cheng Bugzilla 231904: Port RHEL4 fast statfs (for commands such as "df") implementation over. The "lvb" enhancement will be followed around RHEL 5.2 time frame. Ballpark performance numbers: dhcp145 (1 cpu HP): old df took 0.875 seconds, new df 0.008 second dhcp146 (4 cpus DELL): old df took 0.808 seconds, new df 0.006 second. Activated via "gfs_tool settune statfs_fast 1" command. 2007-06-19 Lon Hohberger Fix update failure if node was fenced 2007-06-19 Robert Peterson Resolves: bz 240570: Can't mount GFS file system on AoE device 2007-06-19 Fabio M. Di Nitto Make sure to cleanup the buffer when processing each request or dirty data can be passed from one request to another.
Add a barrier to make sure that the socket data are not bigger than the buffer or we overflow somewhere at random. These 2 changes should be backported to different stable branches. 2007-06-18 Robert Peterson Resolves: bz 244163: Incorrect output of gfs2_tool sb all 2007-06-18 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : 369816627c18db890942ff89355f914304bd9265 RA: IPaddr2 - Refine and fix Cluster IP functionality by Michael Schwartzkopff. Check LVS support isn't enabled (at least until testing verifies everything still works) Use automatic CRM parameters (such as the number of buckets to create) Use locking to ensure the first start and last stop are processed correctly Remove unnecessary use of iptables CONNMARK --HG-- extra : convert_revision : 784fcb44d688b4f1ba7ba03438c450b4f77d7c4a 2007-06-17 Wendy Cheng Bugzilla 239729: Backport RHEL4 glock trimming patch over to improve GFS slab cache consumption issue. bugzilla 244343: Backport RHEL4 gfs datasync patch to head. 2007-06-15 Lon Hohberger Fix #243691/2 2007-06-15 Dejan Muhamedagic CTS: a new script to collect pe input files for one CTS test. --HG-- extra : convert_revision : b1ece09d47bca965a6a6850b7fd30ca13c2a2e2f 2007-06-14 Lon Hohberger Fix type size for 32/64-bit mixed clusters 2007-06-14 Marek 'marx' Grac New flag -F for clusvcadm to respect failover domain (#211469). Also changes clusvcadm -e service00, which now enables the service on the local node and does not respect failover (same as in RHEL4; in RHEL 5.0 it just wrote Failure). Old flag -F (freeze, introduced after RHEL50) was changed to -Z.
2007-06-14 David Lee configure: lack of 'libgnutls-config' need not stop 'mgmt' and 'quorumd' from building --HG-- extra : convert_revision : ad94161a89597d1e194496ca3c8defca8ecea03a 2007-06-13 Lon Hohberger Fix status check Fix #229650 - part 2; fixes an uninitialized var problem 2007-06-13 David Teigland Block SIGINT (^C) around the three steps of mount: joining the mountgroup, doing kernel mount, adding mtab entry. And the same for doing the opposite in unmount. 2007-06-13 Patrick Caulfield Use new openais timers 2007-06-13 Fabio M. Di Nitto Wave goodbye to libcman bits :) 2007-06-13 Patrick Caulfield Don't link cman with libcman! 2007-06-12 David Teigland log an error message if we see mount.gfs killed before it's done 2007-06-11 Fabio M. Di Nitto Remove old dead code from the tree. 2007-06-08 David Teigland Return 1 or 0 GETLK result to the kernel for conflict/no-conflict. We were always returning 0 before. 2007-06-08 Ryan O'Hara Read nodir from lockspace xml node via ccs_get. 2007-06-08 Andrew Beekhof RA: ocf-shellfuncs - Add some lockfile-related functions --HG-- extra : convert_revision : 1fd5b24f31602a901d96603ffb20ea5420af7458 2007-06-07 Andrew Beekhof RA: eDir88 - Repair a patch mangled by email --HG-- extra : convert_revision : 529ba01f8bc1194ff55d851795444de623a7de4a 2007-06-06 David Teigland (copy from RHEL5 branch) New lockspace config for external dlm. Changed get_weight to look for node weight in lockspace config. translate different error numbers from gfs_controld into specific, helpful error messages return a different error number to mount.gfs for each specific failure case, so mount can translate that into a helpful error message 2007-06-06 Robert Peterson Resolves: Bugzilla Bug 242056: GFS2 needs block sizes < 4k (mkfs changes) 2007-06-06 Fabio M. Di Nitto both gnbd and gfs1 need some love for .22.. 
gnbd: - invalidate_bdev changed interface with commit: f98393a64ca1392130724c3acb4e3f325801d2b6 gfs1: - struct kset has been cleaned up with commit: 823bccfc4002296ba88c3ad0f049e1abd8108d30 - posix_test_lock changed interface with commit: 9d6a8c5c213e34c475e72b245a8eb709258e968c 2007-06-06 Dejan Muhamedagic RA eDir88: meta-data fixed again. --HG-- extra : convert_revision : 554c5e067f5de24f3eedb327b0ed0abd7330d97a 2007-06-06 Fabio M. Di Nitto Fix dlm/tool install and clean target Fix LDFLAGS override: /lib and /usr/lib don't need to be specified at link time. 2007-06-05 David Teigland report an error if no lockspace name is provided 2007-06-05 Wendy Cheng Bugzilla 242759: Bump into this problem while debugging bug #236565 (GFS SPECsfs panic). Apparently a minor oversight while adding new function into GFS for RHEL5. GFS versions <= RHEL4 are immune from this issue. Upon memory pressure, VM starts to release inode cache entries that would fail gfs iget. GFS1 flags this error as "ENOMEM" but returns from gfs_create call without releasing the glock. Bugzilla 236565 Fix a race between GFS lookup code and VM cache reclaim logic kicked off under memory pressure. At the end of the lookup, gfs releases inode glock prematurely. This creates a window inside the bottom portion of logic that could make gfs_iget update the associated GFS inode structure that has been freed. Depending on who gets the new memory, unspecified corruptions occur. In the case where this bug is found, it corrupts TCP buffer head that ends up trashing nfsd kernel stack. 2007-06-04 Marek 'marx' Grac Bug: 212479 - ip.sh causes /sbin/ip to produce warnings Missing netmask is parsed from /sbin/ip 2007-06-04 Andrew Beekhof RA: edir88 - Improved monitor action by Yan Fitterer Major improvements to the detection of running local ndsd processes.
RA now follows same logic as official ndsmanage utility, and this removes the risk of the RA detecting eDir processes incorrectly in failure scenarios with multiple processes. Squashed nasty bug where local RA could in certain circumstances connect to remote ndsd process. --HG-- extra : convert_revision : 5d741c7a794260fc06ee50ca9bfb14583e969d4f 2007-06-01 Fabio M. Di Nitto * Fix incdir usage across the entire tree so that: - it can override standard include paths for real - it is always used after more specific inc overrides * Clean up a few Makefiles to be more consistent with CFLAGS ordering * Fix gfs-kernel/src/gfs/Makefile clean: target 2007-06-01 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : 648ed19f6d12b9bb081022dfd483c67d42a59cc1 2007-06-01 Abhijith Das Changes to fix broken code after Bob pulled out metafs mounting functionality from gfs2_quota into libgfs2. 2007-05-31 Lon Hohberger Fix 234249, 229650 2007-05-31 Patrick Caulfield open_lockspace needs to detect kernel version too, otherwise all lockops will fail mysteriously. 2007-05-31 Lars Marowsky-Bree configure: Fix test -L -> -h portability issue. --HG-- extra : convert_revision : 3e0adb1834acc40eba57eb25437bc7d90a132afe 2007-05-30 David Teigland bunch of stuff to test new features Add dlm_ls_deadlock_cancel() that allows a system daemon to cancel an application's deadlocked lock. This requires the latest dlm kernel headers. 2007-05-29 Lars Marowsky-Bree configure: Make test for /proc//exe functionality more explicit. --HG-- extra : convert_revision : c6eee6c1911e8afb4276d2fc5a0b680aaf7346a2 2007-05-29 Jonathan Brassow Require vg_name to be unique. Allowing multiple LVs from the same VG on different machines can lead to races when updating metadata during device failures. 2007-05-29 Andrew Beekhof Build: Remove legacy crud Autoconf 2.53 was released over 5 years ago... 
by now we can assume it is available anywhere that matters --HG-- extra : convert_revision : 83833c44a3567898064006c1f1da75b504580add 2007-05-26 Alan Robertson Updated version to 2.1.1 --HG-- extra : convert_revision : 7a1bec2ceb7b9ef6bd726019ed87c19c1a314891 2007-05-25 Wendy Cheng Apparently we can't remove these two methods from file operations table. Since gfs_read() had been changed to use do_sync_read() that requires to have aio defined in the file operations table. So vector read/write (implies NFSD) will be partially broken again after we put these two methods back. 2007-05-24 David Teigland don't do gfs_sb_print() if we don't detect a gfs fs, it often just prints a bunch of garbage 2007-05-24 Andrew Beekhof build: portability and cleanliness updates to configure.in --HG-- extra : convert_revision : ca53737679519b8efa7fbe42e6231b40a0f3696e clplumbing: Bug 1454 - Detect stale PID files where possible Linux-only patch from Kevin Jamieson, cleaned up and made usable on other systems by Andrew Beekhof. All systems check for a running PID, Linux systems supporting /proc/{pid} additionally check that it has not been reused. --HG-- extra : convert_revision : c62afda08611313b1a2cfea6a7382ba6dea9b928 2007-05-22 Lon Hohberger Update Add missing primary keys to SAP agents 2007-05-21 Lon Hohberger Fix typos in resource script logging 2007-05-21 Patrick Caulfield Add swab.h for compiling against openais trunk Fix typo in openaisincdir 2007-05-18 David Teigland use dlm/Makefile to build lib and tool dirs add dlm_tool, can be used to join/leave lockspace Make new features available based on recent dlm kernel patches. The kernel patches change the user/kernel device interface used by libdlm. (You'll need the new dlm_device.h kernel header installed on your system to build the lib.) libdlm is backward compatible with the old kernel interface, but some of the new features will return errors on old kernels.
New API's provided by libdlm: - dlm_new_lockspace() is just like dlm_create_lockspace() but has a flags arg so flags can be passed when creating/joining a lockspace. - DLM_LSFL_NODIR and DLM_LSFL_TIMEWARN are new flags that can be used with dlm_new_lockspace(). - dlm_ls_purge() is new and can be used to purge orphan locks. - all DLM_LKF flags can now be used with dlm_lock* routines (flags above 16 bits couldn't be used before) - new DLM_LKF_TIMEOUT flag to enable lock timeouts - dlm_ls_lockx() is new and adds "xid" and "timeout" args All previous API's are still available, and programs compiled against previous versions of libdlm should still work. 2007-05-18 Patrick Caulfield Allow ccs to change the two_node flag. bz#240508 2007-05-15 Robert Peterson Resolves: Bugzilla Bug 239023: gfs2_fsck not good at fixing corrupt directory entries. 2007-05-15 Fabio M. Di Nitto Remove unused files. Restore the make dependencies within the same subproject (same as it was before the big rework), do some PHONY clean up, clean up a few Makefiles that were still using an old format. Patch ACK by Lon and Patrick on IRC. 2007-05-15 Abhijith Das Need to write the user/group id to the sysfs quota refresh file instead of '1' 2007-05-14 Robert Peterson Close the /sys/fs directory after using it. 2007-05-14 Lon Hohberger Make manual fencing's command line parser backward compatible; per dct 2007-05-14 Simon Horman [DEBIAN] Make a debian init script for ldirectord See http://bugs.debian.org/391974 --HG-- extra : convert_revision : b6963ae301ea3380341684e7e99340b032982bbc [DEBIAN] Make a debian init script for ldirectord See http://bugs.debian.org/391974 --HG-- extra : convert_revision : c5a529a30c462909ee0638080cd73ac4946d935e 2007-05-11 Robert Peterson Resolves: Bugzilla Bug 239844: mount.gfs2 doesn't work when _netdev is used in /etc/fstab. 2007-05-11 Ryan McCabe Convert \r\n line breaks to \n 2007-05-11 Abhijith Das we don't use this file anymore.
removing 2007-05-10 David Lee plugins/stonith/{ssh,suicide}.c: Better OS portability --HG-- extra : convert_revision : 2c059d51167e81eafc583571b778a806569fb328 2007-05-10 Robert Peterson Resolves: Bugzilla Bug 234844: Need to add a "gfs2_grow" command 2007-05-10 Ryan McCabe don't try to workaround xend networking when running on a non-xen kernel 2007-05-10 Dejan Muhamedagic RA eDir88: meta-data fixed. --HG-- extra : convert_revision : b3e4fb8ebc24f1b51df2f15f78d4aa2488328afc 2007-05-09 Jonathan Brassow If misconfigured, HA LVM + mirroring can cause data corruption. We should attempt to catch configuration errors before allowing LVM resources to start. 2007-05-09 Lon Hohberger Add SAP agents; resolves #238916 2007-05-09 Jonathan Brassow People seem to think that they have to setup lvm in rgmanager even though they are using clvm. This causes the two to collide during use. The HA LVM resource script should detect if a volume is clustered and ignore it. 2007-05-09 Patrick Caulfield Don't override if it appears in cluster.conf. This allows users to disable encryption if they want. 2007-05-08 Wendy Cheng GFS(s) expects NFS fh_type and fh_len would have the same value. This is not correct. One obvious symptom is that it will fail NFS V2 (that uses fixed fh_len for all requests) mount command. 
2007-05-08 Lon Hohberger Readding SAPInstance/SAPDatabase Add SAPInstance and SAPDatabase resource agents to HEAD Apply patch to fix bugzilla # 232140 2007-05-07 dejan@rondo.homenet RA apache: a major update (closes Bug 1357) - OCF_RESKEY_testregex: a new parameter - HTTPDLIST updated to include /usr/sbin/apache and /usr/sbin/apache2 - add '-L' to WGETOPTS - parsing in awk instead of shell (faster and makes 'sh -x' usable again) - updated the parser to support the latest apache Include (directories, shell patterns) - fix a bug triggered if the Listen directive contains only port (the monitor operation) - the start operation waits indefinitely until the apache really starts; a monitor operation is used to check it; NB: it typically makes at least one loop, thus making the start last for at least a second; the reasoning: user should specify how long a start operation may take - replace return with exit in the main (i.e. outside of functions) --HG-- extra : convert_revision : 8f68fd78369baeab64f26419f24ed3649a6b6172 2007-05-04 David Teigland Look in cluster.conf dlm section for protocol, timewarn, and log_debug settings and apply them to kernel if found. 2007-05-04 Wendy Cheng Temporarily disable GFS native AIO support since it currently breaks vector read-write (used by user mode application system call and NFSD). Will come back to fix this soon. Right now, application is expected to use posix AIO call (done by libc AIO emulation). 2007-05-04 Robert Peterson Resolves: bz 229484: gfs_fsck not good at fixing corrupt directory entries Resolves: bz 238740: GFS fsck has problems with resource groups 2007-05-04 David Lee IPaddr/monitor: rationalise some duplicated 'ping' code --HG-- extra : convert_revision : ab9b4952f0f69301bb770518e82696facab2f602 2007-05-03 Fabio M. Di Nitto Fix build system.
Thanks to Alasdair for spotting the error 2007-05-03 Lon Hohberger Add test case from RHEL4 branch Fix corner case reported in #212121 2007-05-03 Patrick Caulfield Change unsigned char* to char* for compatibility with openais trunk. 2007-05-03 Fabio M. Di Nitto Remove dead code Readd ipv6 support to ccs_tool update and add verbose option 2007-05-03 David Lee IPaddr/Solaris: code re-factoring (e.g. 026bab6b8384) had lost 'netmask' keyword to 'ifconfig' command --HG-- extra : convert_revision : d43dc04880c062d7dfcf6c8a44b07016348d2d44 2007-05-03 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : e7f878eb9d2fb247cb79ec0d41da5a105f64d3b5 2007-05-02 Fabio M. Di Nitto pretty self explanatory, this code is not used anywhere. Get rid of it. When the project switched away from magma, we forgot to enable IPv6 for cluster_base_port. The patch fixes that and adds some checks that were marked as TODO in the code. both libccs and daemon were building and linking common/log.c. Make libccs suck in log.o and make ccsd link against libccs. Also fix ccs top level makefile to build in proper order. allow to specify --fence_agents="list of fence agents" at configure time. default is to build all of them and add a little help section. The detection of available list is done by checking fence/agents/agent_name/Makefile presence. It also does another round of Makefile cleanup and fixes the Makefiles for previously DISABLEDAGENTS. 2007-05-02 Patrick Caulfield Add const to libcman Thanks to Jim Meyering for the patch 2007-05-02 Simon Horman [IPaddr] Switch back to sh now bashisms are gone The previous patch removes the known bashisms from IPaddr. Hopefully there aren't any more. --HG-- extra : convert_revision : e97ee3e25c443bd8fbfbce938a9dd78857ffe2f3 Remove bashism from IPaddr * Use sed instead of cut + bashism to calculate a MAC if one isn't provided.
* Use sh pattern matching instead of bashism to verify MAC syntax --HG-- extra : convert_revision : 165a9a897e3b5d609693506f8b403df38ff0f348 2007-05-01 Robert Peterson Resolves: bz 223893: gfs2_fsck unable to fix damaged RGs and RG indexes. 2007-05-01 Fabio M. Di Nitto Fix gfs2 identity exit code path 2007-05-01 Horms [IPv6ADDR] Fill in address bytes and use correct endianness This patch fixes up two bugs introduced by my previous series of patches for ipv6addr * The result of scanf needs to be copied into addr and when copied it needs to be in network byte order. * The mask should be in network byte order. --HG-- extra : convert_revision : 8f18f728af07eb04de53f5ae2e6e01e2afbe56db 2007-04-30 Fabio M. Di Nitto Remove unused vars 2007-04-30 Andrew Beekhof RA: eDir88 - monitor() returning incorrect exit code when monitoring fails. Also includes additional usage guidelines. Patch from Yan Fitterer --HG-- extra : convert_revision : 492cba103cc19d3ea7426e287756ca4574958225 RA: Make sure eDir88 is installed --HG-- extra : convert_revision : d5e9a5cf19cd8422b69cfc827a077edee6a478e1 2007-04-30 Fabio M. Di Nitto Commit new build system as proposed and discussed on cluster-devel mailing list: https://www.redhat.com/archives/cluster-devel/2007-April/msg00139.html and following thread with acknowledgement from other developers on IRC. 2007-04-27 David Teigland various changes 2007-04-27 Lon Hohberger Add patch from Simone Gotti to implement service freeze/unfreeze. Add simple buffer handling for later use. Re-fix #222484 2007-04-27 Ryan McCabe Work around network disruption caused by XenD's bridged networking (bz230783, bz231227). 2007-04-27 Fabio M. Di Nitto Fix build on parisc 2007-04-26 Lon Hohberger Fix #231521 2007-04-26 David Teigland change some mount error conditions to log_error() instead of log_debug() so they appear in syslog. Also set /proc/self/oom_adj. Check right away if the kernel has gfs/gfs2 support by looking in /sys/fs/.
This results in a user-friendly error message instead of something like "gfs_controld error 19". 2007-04-26 Fabio M. Di Nitto Use resrules-noccs in dtest build target 2007-04-26 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : d3824c2d2904a9258b729af3602358fa4f7e757d RA: Debian Bug #420206 - Bashisms in IPaddr2 IPaddr2 makes bash-specific calls so make sure it runs with /bin/bash --HG-- extra : convert_revision : a222241cd70d6060f2a33824c08a2fc9023a41e0 2007-04-24 Andrew Beekhof RA: Return to a simple but correct Dummy RA --HG-- extra : convert_revision : 4da8dc7e0823e049574a7fe3550e536ef4ebb7eb 2007-04-24 Robert Peterson Horrible kludge to allow display/print of the rgs themselves (but not yet the bitmaps) for easier debugging of bz 223893. Example: gfs2_edit -p rgs /dev/trin_vg/trin_lv 2007-04-23 David Teigland Use realpath(3) to canonicalize path names for device and mount point. bz 237544 Look for a protocol setting in cluster.conf dlm section, and set kernel accordingly if found. Also, set /proc/self/oom_adj (all daemons will get this). 2007-04-20 Dejan Muhamedagic Dummy: Reverting the poor ole RA to its previous Dummy state. Perhaps it should lose the delay param too. --HG-- extra : convert_revision : 414a369d38ec0b38518ebf3447448d29037faa8e 2007-04-19 Lon Hohberger Fix bug 234589 Apply patch from Simone Gotti to fix logging errors in clusterfs.sh Apply patch from Andrey Mirkin to fix 237144 Cleanups to make the resource agents behave better (return OCF_NOT_RUNNING, for example) 2007-04-19 Horms [IPv6addr] Use the 32bit wide field of in6_addr in scan_if() Currently the code relies exclusively on using the 8bit wide field of in6_addr. There doesn't really seem to be a good reason not to use the 32bit one, though I guess there is little advantage either way.
--HG-- extra : convert_revision : 774ee922abe7dd8381bb69434557064faf48eae3 [IPv6addr] scan address directly into integers in scan_if() This simplifies the address reading code a little by scanning it directly into 4 32bit unsigned integers which are then converted into network byte order. The former construction of scanning the address into 8 strings, each representing 16bits, and then using sfprintf to write these into a buffer as integers seems a little bit awkward to me. --HG-- extra : convert_revision : edf64a81bb184478ebf4932ad69b491a14766e3f [IPv6addr] Handle scanf failures in scan_if() If for some reason the file being scanned is malformed, or overflows one of the fields for some reason, scanf will neither return EOF nor fill in all of its parameters correctly. In the case that I observed this resulted in an endless loop. This simple fix just bails out in this case. Perhaps an error message is in order? --HG-- extra : convert_revision : 31d19acfd1b95134d7484f3ba72425f492a808d5 [IPv6addr] devname in scan_if() is too short devname is passed to scanf which will fill in a string of up to 20 bytes + trailing '\0'. So make devname 21 bytes long accordingly. --HG-- extra : convert_revision : 37271ae7f11758bcf87b016556c42fdc5e11831b [IPv6addr] Merge duplicated code from find_if() and get_if() into scan_if() I notice that find_if() and get_if() share a non-trivial amount of code. The only difference between these two functions is that the latter constructs and uses a mask, whereas the former only looks for an exact match. By creating a new function scan_if(), with a switch telling it whether or not to construct a mask based on the prefix - the default mask is all 1s, or in other words an exact match - the code duplication is removed. --HG-- extra : convert_revision : aec55b384434ed099113384a59971ae73ec2de5e [IPv6addr] Use memset to set mask in find_if() By using memset instead of a small for loop the code becomes a lot more compact without altering its behaviour.
I think this is a good thing :) --HG-- extra : convert_revision : fe0c7fa877aa707ef5802e31c3fc0f35a9d7866d [IPv6addr] overrun in find_if() for 128bit prefixes while reading over the IPv6addr code I noticed that there is an overrun in find_if() in the case where the prefix is 128. In this case, mask.s6_addr[16] will be accessed twice, but that array only has 16 elements. The patch below takes the simple approach of just treating 128 as a corner case and skipping the offending parts of the mask manipulation accordingly. It also reverses the way the mask is seeded, removing bits rather than adding them, to ensure that the corner case is all 1s rather than all 0s. --HG-- extra : convert_revision : b4bc188b4ebe94824e042a674770c90ee4335469 [IPv6addr] create_pid_directory() leaks dir Currently create_pid_directory() never frees the buffer it allocates, resulting in some memory leakage. --HG-- extra : convert_revision : 1fe9db602a041259821b6ff77e9a41e244d36249 [IPv6addr] send_ua() is leaking l Currently send_ua() does not call libnet_destroy() on the handle that it creates using libnet_init() resulting in some memory leakage. --HG-- extra : convert_revision : bdbfe70474d29ca6baee2c8010802f60790b6ec6 2007-04-18 Lars Marowsky-Bree Merge in more dev changes. --HG-- extra : convert_revision : e985df0a78315379bcfe7845d09869547122ebc7 Merge eDir88 changes with dev. --HG-- extra : convert_revision : f885067d02326aec28163318d7605aacb1b65a2f RA: eDir88: Updates from Yan Fitterer. --HG-- extra : convert_revision : dd142e11cc01593f4b3e2d69270c3f3b444ad7be Renamed eDirectory RA to eDir88. --HG-- rename : heartbeat/eDirectory.in => heartbeat/eDir88.in extra : convert_revision : 098085ff648e3362c46dfd0076f89165ea1439cf 2007-04-18 Dejan Muhamedagic RA: db2 got new parameter: admin. Fixes Bug 1485. --HG-- extra : convert_revision : d01c94a1126759d261947ad717e3a2d38608f2fc 2007-04-18 Jonathan Brassow Bug 236580: [HA LVM]: Bringing site back on-line after failure causes pr...
Setup: - 2 interconnected sites - each site has a disk and a machine - LVM mirroring is used to mirror the disks from the sites When one site fails, the LVM happily moves over to the second site - removing the failed disk from the VG that was part of the failed site. However, when the failed site is restored and the service attempts to move back to the original machine, it fails because of the conflicts in LVM metadata on the disks. This fix allows the LV to be reactivated on the original node by filtering out the devices which have stale metadata (i.e. the device that was removed during the failure). 2007-04-18 Lon Hohberger Fix dtest.c compile errors Add obvious requirement on shared resource case as suggested by Simone Gotti fix depends.h/depends.c 2007-04-18 Patrick Caulfield Install udev rules file 2007-04-18 Lars Marowsky-Bree Wrong closing parenthesis. --HG-- extra : convert_revision : f3c4c217a2ebf79f2d4bee19f52104d381006bcc Add eDirectory RA to configure.in. --HG-- extra : convert_revision : b396e318288b43227818f44f8982c0f899e08a8f RA: eDirectory: Some cleanups. --HG-- extra : convert_revision : fb7399480c5b6c6bbc90b2af19e68b7304a7ea37 RA: eDirectory: Initial merge of code contributed by Yan Fitterer. --HG-- extra : convert_revision : 8ecbff996f7e4ee9a2eef0b75090f45791404428 2007-04-17 Dejan Muhamedagic RA: Dummy and Delay fixed to use OCF_RESOURCE_INSTANCE. Dummy got a couple of extra variables to control TERM signal handling and verbosity. --HG-- extra : convert_revision : 03a001f6113d73afe9a8fdd2b3a75df86628c04f RA: ocf-shellfuncs.in invokes ha_debug instead of ha_log when appropriate. --HG-- extra : convert_revision : 7bee8bcb73366c4bf95b79a356cebbcf5b02f9a6 2007-04-12 Lon Hohberger Fix watchdog race on rgmanager exit; BZ#236204, patch from Andrey Mirkin 2007-04-05 Lon Hohberger Make agents more OCF (Open Cluster Framework) compliant 2007-04-04 Abhijith Das fix for bz 225199 - Same as GFS1 fix in RHEL4 (bz 210362).
We don't run through the entire gfs_quota sparse file to do a list operation anymore. We get the layout of the gfs_quota file on disk and only read quota information off the data blocks that are actually in use. Also added functionality to GFS_IOCTL_SUPER to provide the metadata of the hidden quota file. 2007-04-04 Lon Hohberger allow ocfs[2] to work with the clusterfs resource agent. Also, commit patch which corrects interval processing for status operations 2007-04-04 Ryan McCabe Make power on work correctly for RIBCL version 2.22 on both iLO2 and iLO: for the former, HOLD_PWR_BTN is used to both power the machine on and off; when the power is off, PRESS_PWR_BUTTON has no effect. For the latter, HOLD_PWR_BUTTON is used to power the machine off, and PRESS_PWR_BUTTON is used to power the machine on; when the power is off, HOLD_PWR_BUTTON has no effect. 2007-04-03 Lon Hohberger Kill VM machine immediately; patch from Jeroen van den Horn 2007-04-03 Robert Peterson Resolves: Bugzilla Bug 235061: gfs_fsck: Bad programmer! You forgot to catch the ÿ flag. Resolves: Bugzilla Bug 235060: gfs_fsck: Bad programmer! You forgot to catch the ÿ flag. 2007-04-03 Dejan Muhamedagic RA: fix meta-data to conform to XML. --HG-- extra : convert_revision : 6fe09dd951ca603e57ddfc88df6d5ba90ad25d34 RA: fix meta-data to conform to XML. --HG-- extra : convert_revision : 1904c858f70a42878778222964d83e200585af16 2007-04-02 Dejan Muhamedagic RA: Convert SAP* from dos :) Allow ManageRAID to show meta-data at all times. --HG-- extra : convert_revision : 227147de538f86f2e263e9fecfbb515e6687ab80 2007-03-29 Robert Peterson Jump from RG index was broken. 2007-03-28 Lon Hohberger Fix SPARC / HPPA build; patch from Fabio M. Di Nitto 2007-03-28 Patrick Caulfield Remove udev file from here as it is confusing. The real one is in ../scripts Newer versions of udev prefer == to = 2007-03-28 Lars Marowsky-Bree RA: Filesystem: Fix metadata require/unique settings.
--HG-- extra : convert_revision : 7436765798410ab54851d4661ab6bf8c4abcba61 2007-03-27 Lon Hohberger Merge patch from Crosswalk team Team: Leonard Maiorani, Scott Cannata, Henry Harris * Always check malloc() return codes * Fix errant clu_unlock() calls in vft.c in cases where clu_lock() failed * Add ability to wrap pthread_mutex / pthread_rwlock calls for better stability * Fix improper pthread_mutex_destroy() semantics 2007-03-27 David Teigland latest version, "stress" test running correctly 2007-03-27 Patrick Caulfield Actually, MAX_INT is a bit of a bad idea under this new system. Fix bug where we could free an lksb while dlm_lock is still using it. 2007-03-26 Robert Peterson Resolves: Bugzilla Bug 233083: Wrong link command in gfs2-utils mkfs/Makefile (with solution). Resolves: Bugzilla Bug 232124: gfs2_fsck will create multiple lost+found directories. Resolves: Bugzilla Bug 232019: gfs2_fsck doesn't fix an ea problem. 2007-03-23 David Lee configure: Add closing information to user about '/etc/passwd' and 'make install' --HG-- extra : convert_revision : 3f4236a16927ebb9ea08fe204831a2431d918975 2007-03-23 Dejan Muhamedagic RA: Fixed the three pseudo RAs to use ha_pseudo_resource. Extended the Dummy RA to support checks for serialization and to survive the TERM signal. As Alan pointed out, the pseudo RAs should be using the common interface which has already been done in heartbeat/shellfuncs. The Dummy RA has been expanded in order to support testing. First use: LRM regression tests. --HG-- extra : convert_revision : 97f81d18c5f4e308a25984539a6c454d681b1d87 2007-03-23 Lon Hohberger Remove dead code; fix build_tree loop Merge ordering patch from RHEL4 branch; update automated test cases Use more strict build options 2007-03-22 Lars Marowsky-Bree RA: drbd: If drbd is stopped, demote should fail with NOT_RUNNING. --HG-- extra : convert_revision : e279e9ae3c1033ea70a0b5dd021e0dd34fed0240 RA: Filesystem: Harmless typo corrected. 
It's harmless, because the value not being set expands to zero, which is the desired result. But, kind of stupid anyway ;-) --HG-- extra : convert_revision : 9465871dff2f353d166ae476bde4326389c23d77 2007-03-20 Alan Robertson Merged in upstream changes. --HG-- extra : convert_revision : 6890a1f01de4a5ed71a99fa1a7b27cd3694e9649 Linux Foundation bug 1528: IPaddr2 gives a spurious warning on stop (SGI961736) http://old.linux-foundation.org/developer_bugzilla/show_bug.cgi?id=1528 --HG-- extra : convert_revision : 3d06184b4f224430dd39aa1131852484682b6a13 2007-03-20 Lon Hohberger Fix multimaster bug: ensure timings are accurate and provide multi-master conflict resolution Fix clean target; patch from Fabio M. Di Nitto Force release of lockspace; patch from Patrick Caulfield Apply build cleanup patch from Fabio M. Di Nitto 2007-03-15 Patrick Caulfield If the machine is multi-homed, then using a truncated name in uname but not in cluster.conf would fail to match them up. Support IP(v4) addresses in cluster.conf per bz#232068 2007-03-14 David Lee Replace multiple derivations (poor) of locations with direct derivation from configure --HG-- extra : convert_revision : 39c06de955f3d336319a415336ab27d884b7bd62 2007-03-13 Dave Blaschke OSDL bug 393 - Enable build of plugin if OpenHPI present --HG-- extra : convert_revision : adafc56fec02130d484d30a1ade1e3dea5dffd41 2007-03-12 Andrew Beekhof Hg: Merge in local migration fixes --HG-- extra : convert_revision : 0ab7b398d309c84f847582bf0acfd060f5d0936a 2007-03-11 Alan Robertson Updated version to 2.0.9 --HG-- extra : convert_revision : e665c504932d9a88708bbf51f1e0561fd9b5e50e 2007-03-11 Andrew Beekhof RA: Novell #250273 - Provide an RA for managing evmsd as a resource Starting evmsd from ha.cf means that * it will be stopped before the CRM can stop the container resources (EvmsSCC) * it will not share the same view of the cluster (since heartbeat/crmd will still be active when evmsd is stopping) --HG-- extra : convert_revision : 
a762d57ca20279d5191072d1f6233d43d3a7c92b 2007-03-10 Lon Hohberger Strings cleanup. Enable vm.sh live migration. Fix help message 2007-03-08 David Lee Ensure 'awk' is found in runtime environment --HG-- extra : convert_revision : 12f354fed9b9982a1326ddfe6a92419835659d66 2007-03-08 Ryan McCabe Add 'M' to the getopt string to keep clusvcadm from complaining that M is an invalid option. 2007-03-08 Alan Robertson Linux Foundation bug 1505: Findif parsing is all screwed up unless all the parameters are passed, not defaulted. I changed findif to use the OCF parameters that are passed to both IPaddr and IPaddr2, which got rid of a bunch of ugly code at the same time as fixing this bug. Sounds like a win to me. In the process I had to make IPaddr and IPaddr2 somewhat more consistent than they had been - but not in an incompatible way, and this is also a good thing. --HG-- extra : convert_revision : 187c17c0769e7c659d519944b0821ea4939cd515 2007-03-07 Lars Marowsky-Bree BSC: Test Dummy, IPaddr, IPaddr2, Filesystem RAs. --HG-- extra : convert_revision : 4e8d7a86ee07955dfd2699cf303b7cb4918e0020 2007-03-07 Lon Hohberger Fix 213241 2007-03-06 Ryan O'Hara Fix help message to refer to script as 'fence_scsi_test'. 2007-03-06 Lon Hohberger Add open failure message Fix missing newline in debug message Resolves: 231151 Enable auto-fallback to no-authentication for fence_xvm if fence_xvm.key does not exist. 2007-03-06 David Lee Correct a typo. --HG-- extra : convert_revision : 653bca0deda01dac3ff0095ef238c6662beebea0 2007-03-06 Lars Marowsky-Bree Merge dev changes into a single head again. --HG-- extra : convert_revision : fcd3125141c644b40aa656a68cba3b6f27c46048 2007-03-05 Lars Marowsky-Bree Merge dev with local changes. 
--HG-- extra : convert_revision : c39f9f5f17eb08f0e8d162fbcb782b9fc04db074 2007-03-05 Andrew Beekhof build: make life easier on OSX --HG-- extra : convert_revision : c166e4f054e02a91254eca687fc8cc6d0ad5998c 2007-03-05 Lars Marowsky-Bree Whitespace cleanup and minor fixes ("" for -z arguments, using -ne when comparing numbers, etc). And no, you may not like this white-space style either, but at least the file now is consistent. Having to read several different styles in one file was getting on my nerves ;-) --HG-- extra : convert_revision : 5ea2f901165d5c92f63eb834bb2d8ff167f20384 RA: Filesystem: Prune duplicates from active and starting list. Could otherwise cause a spurious error message when we try activating a node twice - second call to ln would fail. --HG-- extra : convert_revision : 68908630c758948ee49725113300e7714c3abc1e 2007-03-04 Andrew Beekhof Hg: Merge in local fixes --HG-- extra : convert_revision : 743bee6291da6af8c26fbd7c3c7bc79b7aea0f35 RA: Return after performing a stop action in the Filesystem agent --HG-- extra : convert_revision : f2591d23b636eb006f4dcb1079822e3e963b0e7b 2007-03-03 Lars Marowsky-Bree RA: Xen: Fix status function for newer Xen versions (Novell 250625) --HG-- extra : convert_revision : 802d7868f2f842760b2cbe3c09658cb7b3916cda 2007-03-03 Andrew Beekhof build: dopd doesn't build without the crm, disable it if the crm is disabled --HG-- extra : convert_revision : 8f6c8921fb1f64c636af5d17ce96d96b551976a9 2007-03-02 Alan Robertson Trivial documentation fix: added the --enable-snmp-subagent to the configure help message. --HG-- extra : convert_revision : 3f072b960ec55e503fd7a5ee85e030189a4815e2 2007-03-02 Lars Marowsky-Bree Fix typo. --HG-- extra : convert_revision : cfdee9080ba3974ca4d1460c932432b550e15ed6 RA: Filesystem: Move ocfs2_init out of the generic stop path (Novell 250603).
--HG-- extra : convert_revision : d191dbdd65da08201f3ddefc9fede6ba8ef87bb3 2007-03-02 Andrew Beekhof build: Add explanatory comment for configure kludge --HG-- extra : convert_revision : 5d737671552a5536cc9ed443503e047ce9203d77 2007-03-01 Andrew Beekhof Hg: Merge in upstream RA changes --HG-- extra : convert_revision : af4f49d24c27fe4b0e745138baf659e27def61bd 2007-03-01 Lars Marowsky-Bree RA: LVM: Add some hints for future todo items. --HG-- extra : convert_revision : 9ef1287680f85a75706633787af03a38b0f0ff03 RA: EvmsSCC: More fixes from Jo (Novell 199730) --HG-- extra : convert_revision : bd9d356503ac5c0f45a57f35842fc157dd89ec51 2007-02-28 Robert Peterson Made hex editing a lot easier (for bz 229484). Fixed several bugs regarding printing. Added ability to print/view gfs1 journal index. 2007-02-28 Andrew Beekhof Build: Misc changes to allow building with crm disabled --HG-- extra : convert_revision : a55d3c407f18efbb5fd30280f615d4cc989dba67 2007-02-27 David Teigland un-comment-out gfs-kernel and gnbd-kernel since they now build on upstream kernel updates lots of changes, biggest is new "stress" test 2007-02-27 Stanko Kupcevic Fix for bz230134 (can't fence port 1:1 with fence_apc) 2007-02-27 Andrew Beekhof RA: pgsql fixes and enhancements * Use the correct variable names * Add sleep 1 to the wait-loops * Add debug logging to the wait-loops * Indicate when we escalated the stop to use -m immediate --HG-- extra : convert_revision : ad7fda7da428547fe01e56e02a829c6c84899177 2007-02-26 Robert Peterson Resolves bz: 229222: gfs2_fsck stuck in infinite loop 2007-02-26 Andrew Beekhof RA: Fix variable initialization in pgsql --HG-- extra : convert_revision : ef0fdb0c2262a8c7a0d1a7c3720cfca50816b74b Build: Fix typo in configure.in --HG-- extra : convert_revision : ada338610b9949aa36f479b3ed4d5ffca47b10a3 2007-02-22 Lon Hohberger Make status checks happen at 'start' time (parent-before-child) instead of 'stop' time (parent-after-child).
2007-02-22 David Teigland remove the self parameter (sync up with RHEL5 branch) 2007-02-22 Andrew Beekhof Build: Header-file related cleanup * Move headers generated by configure to include/ * Re-instate an "external" header for use by out-of-tree projects * Include all (but only) relevant defines in the external header I.e., directory locations, and user/group names/ids but not which functions, headers or libraries are installed * Do not include internal headers in external ones Doing so prevents out-of-tree builds * Do not re-define variables in Makefiles (i.e., -DSOMEVAR=@SOMEVAR@) This is what config.h is for * Do not create multiple names for the same #define * Give portability.h a more appropriate name * Document the header files managed by configure and how they are created * Don't define variables that don't exist to be other variables that also don't exist * Don't expose internal build options (like if we use alphasort for sorting) * Tweak the names of some #defines for clarity and to avoid namespace pollution * Remove dead options --HG-- extra : convert_revision : 24a89cbdacde614b4f6e771aeaf51c184d436f49 2007-02-26 Andrew Beekhof RA: pgsql improvements * Avoid needlessly re-aliasing environment variables * Give stop_wait a more indicative name (i.e., it controls escalation) * Make sure the resource is dead before returning from stop * Remove start_wait, this is more consistently done with action timeouts * Simplify the start action by having it call the monitor action to ensure it has started before returning --HG-- extra : convert_revision : 2e9b22cfb7e12b08ecd6d6dd9416419bbb73c305 2007-02-23 Lars Marowsky-Bree RA: pgsql: Several enhancements by Keisuke MORI 1) In 'start', wait until the postmaster is ready to answer, using the same check that 'monitor' does. The maximum wait time for startup can be customized by an additional parameter 'start_wait'. 2) Add cleanup code for 'postmaster.pid' on stop and before starting.
3) In 'stop', wait until the postmaster completes to the fast mode shutdown. The maximum wait time to complete to shutdown can be customized by an additional parameter 'stop_wait'. --HG-- extra : convert_revision : d24ab11e98116d43d9b1a18e0aafd469baca654a 2007-02-23 Andrew Beekhof RA: Filesystem - When checking ocfs2 is cloned, look for variables that are always set --HG-- extra : convert_revision : 959f2c429fc323588a33d7a66589944be4d12fae 2007-02-22 Robert Peterson Resolves: bz 229601: gfs_tool fails to report counters 2007-02-21 Lon Hohberger Fix anonuid/anongid parsing in nfsclient.sh Resolves: 222445 * Only let one status queue thread spawn at a time Other: * Misc tweaks to alloc.c for debugging Resolves: 229338 * Makes zero-heuristic mode work (#229338) General (small) fixes: * Add time stamp to status file * Hush stdout/stderr from init script * Give lots of information in status file if debug mode is enabled Fixes for clusters with long failover times (e.g. 2+ minutes): * Enable status file generation during initialization loop * Allow termination (e.g. service qdiskd stop) during initialization loop * Add tunables for clusters with long failure detection times (e.g. 2+ minutes) Add example test configuration for dtest Remove ancient / unused script Check in missing header 2007-02-20 Lon Hohberger Add missing comment Initial checkin of simple dependency engine Fix 229254 - extraneous man pages, 228823 - allow disable of services stuck in 'stopping' state 2007-02-20 Robert Peterson Resolves: bz 229220: gfs_fsck stuck in infinite loop 2007-02-19 Ryan McCabe Support power on/reboot for iLO2 2007-02-19 Patrick Caulfield Add -c clustername to help output. Add delay switch 2007-02-19 Lars Marowsky-Bree RA: Xen: Add a little reminder to the code. --HG-- extra : convert_revision : 5137466411499928efa0f8239db516bb75101ffc 2007-02-19 Patrick Caulfield If exec fails, then tell the parent process. 
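The 2007-02-19 entry above ("If exec fails, then tell the parent process") names a common pattern without showing it. What follows is a minimal sketch of one usual way to do this, not the code from that commit: the child reports the exec errno over a close-on-exec pipe, and the parent treats end-of-file on the pipe as a successful exec. The function name spawn and the 127 exit code are assumptions for illustration.

```python
import errno
import os
import struct


def spawn(argv):
    """Fork and exec argv; raise OSError in the parent if the exec fails.

    Hypothetical helper for illustration only.
    """
    # Python creates pipe descriptors non-inheritable (close-on-exec), so a
    # successful exec closes write_fd and the parent sees end-of-file.
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: try to exec; on failure, send errno to the parent and exit.
        os.close(read_fd)
        try:
            os.execvp(argv[0], argv)
        except OSError as exc:
            os.write(write_fd, struct.pack("i", exc.errno or errno.ENOEXEC))
        finally:
            os._exit(127)
    # Parent: an empty read means the exec succeeded.
    os.close(write_fd)
    data = os.read(read_fd, struct.calcsize("i"))
    os.close(read_fd)
    if data:
        os.waitpid(pid, 0)  # reap the failed child
        code = struct.unpack("i", data)[0]
        raise OSError(code, os.strerror(code))
    return pid
```

With this shape the caller can tell "the program ran and exited non-zero" apart from "the program could not be started at all", which is the distinction the entry is about.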
2007-02-19 Lars Marowsky-Bree RA: drbd: Make parsing of drbdadm output more robust. drbdadm output is not exactly always consistent. --HG-- extra : convert_revision : d7237a392adad26babee0e240d42f33105ea2298 RA: Filesystem: Increase default suggested timeouts. --HG-- extra : convert_revision : 827f9c7f4200bc40e07f7c4d749e93e0e9c7781c 2007-02-18 Andrew Beekhof build: commas mean something in autofoo --HG-- extra : convert_revision : 425d26d3c996c2bcfc5e6877557a720526c04e13 Hg: Merge in local changes --HG-- extra : convert_revision : a769aaa17a54a75ccee9be020a148d58143adfe1 Build: Always process the Valgrind logging option --HG-- extra : convert_revision : 5275421847335d7455f018b3bbabc07b783ed5d4 2007-02-18 Lars Marowsky-Bree RA: drbd: When using the nodename-override, drbdadm likes to print an annoying notice which needs to be filtered out before parsing the output. --HG-- extra : convert_revision : 8c6b4231dffd9b18f96d847876a2611fb3ba81d0 2007-02-16 Lars Marowsky-Bree Remove CVS artifacts. ($Id$ and $Log$ have no meaning to mercurial.) --HG-- extra : convert_revision : 0e5d870d89c498a5ae72587bed72cf5a38596e88 2007-02-16 David Lee libgnutls: if no 'libgnutls-config', try 'AC_CHECK_LIB(...)' --HG-- extra : convert_revision : 6b2e1453775620e59e210fdc5b3cf4206d1587ec 2007-02-15 Lon Hohberger Add LVM failover agent; by Jon Brassow 2007-02-15 Ryan McCabe - Document the -S/passwd_script fence params. - Update the copyright notices. 2007-02-15 David Lee Determine GNUTLS cflags centrally (follow-on from 7a742e29e3f8). 
--HG-- extra : convert_revision : 16823e8931db15781b9a7a1ce0bad09c4791c705 2007-02-15 Lon Hohberger Fix missing copytobin target for RHEL4 branch 2007-02-15 David Lee Determine GNUTLS libs centrally in 'configure.in' rather than in individual Makefiles --HG-- extra : convert_revision : 7a742e29e3f83d3469eb0650360ee85698871bb6 2007-02-14 Lon Hohberger Add member_util.sh to installation Add member_util.sh functions Add RA installs to trunk; Make sure utility stuff is installed in the right place 2007-02-14 Ryan O'Hara Ignore EPIPE error when sending response. This can happen if, for example, rgmanager makes a request, ccs receives/processes the request, but rgmanager dies before ccs can send the response. Also added retry if we catch EINTR during write. 2007-02-14 Lon Hohberger Make service.sh understand lvm RA type 2007-02-14 Andrew Beekhof Build: Not all platforms need -lpam --HG-- extra : convert_revision : 3fceb816ac71dc20adc3c112b2afae6190d91ab2 2007-02-14 Stanko Kupcevic Support "passwd_script" parameter in python fence agents. If both "passwd" and "passwd_script" parameters are specified, "passwd_script" will be used first (if it fails, fencing will be attempted using "passwd" parameter). 2007-02-14 Patrick Caulfield Don't report 0 exit status as a failure. 2007-02-14 Stanko Kupcevic fence_apc_snmp ignores "port" parameter 2007-02-14 Patrick Caulfield Add man page info for ccs_tool addnodeids 2007-02-13 Ryan McCabe Support the "passwd_script" parameter in the C fence agents.
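The "passwd_script" entries above describe a lookup order: when both "passwd" and "passwd_script" are given, the script is tried first and the static password is the fallback. A minimal sketch of that order follows. It is not the fence agents' actual code. The function name resolve_password, the 10-second timeout, and the convention that the script prints the password on stdout are all assumptions, and "fails" is read here as the script not running, exiting non-zero, or printing nothing.

```python
import subprocess


def resolve_password(passwd=None, passwd_script=None):
    """Return the password to use, preferring passwd_script over passwd.

    Hypothetical helper for illustration only.
    """
    if passwd_script:
        try:
            result = subprocess.run(
                [passwd_script],
                capture_output=True,
                text=True,
                timeout=10,  # assumed limit so a hung script cannot block fencing
            )
            candidate = result.stdout.strip()
            if result.returncode == 0 and candidate:
                return candidate
        except (OSError, subprocess.TimeoutExpired):
            # Script missing, not executable, or too slow: fall through.
            pass
    # Fall back to the static password (may be None if none was configured).
    return passwd
```

Trying the script first keeps the secret out of the static configuration when the script works, while the fallback keeps fencing usable when it does not.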
2007-02-13 Lon Hohberger Remove fence_manual; only provide manual-failure override Apply fixes from RHEL4 branch 2007-02-13 Andrew Beekhof Hg: Merge in the CIB remote listener --HG-- extra : convert_revision : ad01e969ff1f75e8f1adfcb416f188b296bce2bb build: Always check for the presence of PAM and TLS headers --HG-- extra : convert_revision : 048448b1535c428dc2a88afaaa54aa2aaab771ea 2007-02-13 Horms Enhance Pgsql ocf resource to handle multiple instances Patch from Martin Bene: When trying to run & monitor several postgres instances I found I need an additional parameter for the port an instance is listening on. CC: Martin Bene --HG-- extra : convert_revision : 29ae8e263a4789f041425c396ba066debabc5c88 2007-02-13 Robert Peterson Misc improvements. Better scrolling. You can now scroll through the rindex. The superblock now has a pseudo-extended display. Fixed the file offset calculations for indirect pointers. It still has some bugs but it's better than it was. 2007-02-12 Matthew Soffen Added count of how many inet lines to use for the diff -B? --HG-- extra : convert_revision : a7df45e2486e936a64b3df084268627601f82585 2007-02-12 Ryan McCabe Update the perl fence agents to take the additional command line option -S or stdin param passwd_script= 2007-02-12 Robert Peterson Resolves: bz 221743: gfs2_fsck errors still Resolves: bz 222308: mkfs and journal addition for GFS2 should produce contiguous journals.
2007-02-12 Andrew Beekhof Build: Remove redundant and useless items from the build --HG-- extra : convert_revision : 62a77874bdf48dd7b055a64589998e343c818fd0 Build: Make supplying a Valgrind suppression file simpler Also decouple using Valgrind from enabling libc malloc --HG-- extra : convert_revision : 2fceec5a51bfcda98861c1f1f6afd61fb4d7929d 2007-02-09 Andrew Beekhof Hg: Merge in memory-leak related fixes --HG-- extra : convert_revision : f5744ff6ef4ddd3aae209854981a8f9124f6ec16 2007-02-09 David Teigland clear configfs stuff if we get SIGTERM, this is a convenience if you want to kill dlm_controld and remove the dlm module without leaving the cluster. Otherwise you have to manually clear configfs dirs. If the only two groups were two dlm lockspaces, then during recovery, the first would detect the all_nodes_all_stopped condition and move on to the starting state, and the second would never get a chance to detect the all_nodes_all_stopped state since the event state of the first was no longer FAIL_ALL_STOPPED. Use a separate flag to indicate that the all stopped state has been reached instead of relying on the event state. 2007-02-09 Andrew Beekhof Build: Configure switch to easily enable Valgrind'ing of the CRM --HG-- extra : convert_revision : c5b7cd8fddd86f030ddd19f2638b78efac57a650 2007-02-08 Robert Peterson Fixed some bugs and made some improvements. When displaying indirect blocks, it now gives the data offset. Also added page-up/page-down/home/end navigation to block list display. 2007-02-07 Lars Ellenberg moved drbd peer outdater into contrib/ fixed make rpm --HG-- extra : convert_revision : b46578d11acf926dcc8f5fda934b416afda343e9 2007-02-06 Lon Hohberger Don't query rgmanager if the user only wants a node state 2007-02-05 Lars Marowsky-Bree Remove a few work-arounds now that notifications work correctly. --HG-- extra : convert_revision : ef5945fc1fbee4956abfed4a55d160b530d84573 Remove a bunch of dead code (Perl/SWIG bindings). 
--HG-- extra : convert_revision : 876fe13ae3a54545ac6e423265b6701fd8948523 configure: Check for more pre-requisites of the mgmt/quorum components. Patch by David Lee. --HG-- extra : convert_revision : e18da980360907d4efe924bcb2addff0cbee2be4 2007-02-02 Robert Peterson Misc updates to bring gfs-kernel up to the 2.6.20-rc7 and similar kernels. Also fixed some minor typos. 2007-02-02 Andrew Beekhof Build: Include the exact version being built even when building an archived tree --HG-- extra : convert_revision : 813df3509c408753ed3d8e66df23b8192d8f811a Finally get a chance to correct lmb's spelling --HG-- extra : convert_revision : 76937279d6b2e69353e571e90ad55212cb9a4c20 2007-02-01 Patrick Caulfield Add threads example 2007-02-01 Lars Marowsky-Bree Advertise migrate_from and _to actions. Mention migrate in meta-data. --HG-- extra : convert_revision : b97e72c3beb181471ef7c156f05cec8375e7ed61 2007-01-31 David Teigland join lockspace, optionally sleep, leave the lockspace. useful for testing, also useful to clear/release a lockspace from any random process that exited without releasing its lockspace 2007-01-31 James Parsons New apc agent written in python that supports named outlets and outlet groups, minus perl pain. Addresses bz172179 and bz134489. yee haw fix for bz220946 fix for bz205457 2007-01-31 David Teigland test program like gfs's 'alternate' but using an lvb instead of a file. nodes take turns incrementing the counter in the lvb 2007-01-31 Patrick Caulfield Read the LVB every time, rather than not at all. 2007-01-29 Ryan O'Hara If no password is specified, pass a "-P ''" to the ipmitool to prevent it from prompting for a password. 2007-01-29 Lon Hohberger Add error reporting if msg_open fails; patch from Josef Whiter 2007-01-29 Lars Marowsky-Bree RA: drbd: Fix some issues and work-arounds for drbdadm bugs. 
--HG-- extra : convert_revision : 44c4b218f3c07c87cf96ca16f743ebd0bb662e36 2007-01-26 Lon Hohberger Fix 223519 Port fix for logging of errors in config from RHEL5 branch Add list_prepend macro Add override for action timings Clean up test cases merge fixes from RHEL5 branch Fix #222484 2007-01-26 Ryan O'Hara file scsi_watchdog.conf was initially added on branch RHEL4. file scsi_watchdog was initially added on branch RHEL4. file scsi_reserve.sysconfig was initially added on branch RHEL4. 2007-01-26 Lon Hohberger Patch from Fabio Massimo Di Nitto - Fix portability of getuptime function 2007-01-25 Andrew Beekhof RA: Fix up the Dummy metadata --HG-- extra : convert_revision : ee45974689289e3a8aa6a0c3ed85ebde7d663350 2007-01-24 Robert Peterson Resolves: bz 222299: gfs knows of directories which it chooses not to display 2007-01-23 Robert Peterson Resolves: bz 222759: gfs_mkfs doesn't zero data after gfs superblock Resolves: bz 223500: gfs2_fsck runs slower than previous version Resolves: bz 223843 GFS2: gfs2_fsck segfaulting on corrupt extended attributes Resolves: bz 223506: gfs2_fsck: fatal: invalid metadata block This is a crosswrite from gfs1. 1. Fix a memory leak in pass1b. 2. Improve performance of pass1b by combining loops through fs. 3. Give an error message and abort if file system > 16TB and node architecture is 32-bits. 4. Give users an "Abort" "Continue" and "Skip" if they interrupt with ctrl-c. Also, report progress for that pass on interrupt. 5. Added more "percent complete" messages for other passes. See bz comments for more details. 
2007-01-23 Lon Hohberger Use /proc/uptime by default instead of gettimeofday(2) for internal timings to avoid problems when the clock is reset by NTP 2007-01-23 Robert Peterson Resolves: bz 222933: regression: fence_tool no longer times out after 300 seconds 2007-01-23 Andrew Beekhof RA: -eq is for integer comparisons --HG-- extra : convert_revision : eb466fa101bf88f4d6b32dbfce77b021ba40f49d 2007-01-23 Lon Hohberger Simple manual override for fenced & example replacement for fence_ack_manual 2007-01-22 Lon Hohberger Resolves bugzillas: #213533, #216092, #220211, #223002, #223234/#223240 Detailed comments: * Lock in memory to prevent being swapped out * Turn on RR scheduling for main + score threads * Let qdiskd wait for CMAN to start * Add option to qdiskd to stop CMAN if qdisk device is not available * Make qdisk interval timings more accurate * Add option to reboot node if qdiskd detects internal hang > failure time (e.g. interval*tko, in seconds) * Add per-heuristic tko counts for unreliable heuristics (e.g. ping packets) * Remove nodes from quorate mask immediately on eviction * Update man pages with better examples * Don't let >1 instance of qdiskd be started * Clarify logging output. * Improve data in status_file. * Allow qdiskd to run with no defined heuristics (master-always-wins mode). * Make fencing of nodes optional (default = on). * Make sure CMAN is running before we try to talk to it at each point. 2007-01-21 Lars Marowsky-Bree Fix a typo in the default path to scp. (Only mattered if scp wasn't installed at compile time.) --HG-- extra : convert_revision : 2c91e83b3747dec84df361a60291efbb8180fabe 2007-01-19 Robert Peterson Resolves: bz 222871 gfs_fsck runs slower than previous versions 2007-01-19 Lars Marowsky-Bree Add some defaults to configure.in to reduce buildrequires. --HG-- extra : convert_revision : 0855cf458913f3b48be8540f18eed366ee700ddc 2007-01-17 Robert Peterson Resolves: bz 222743: gfs_grow gets the rgindex out of order.
2007-01-17 Lon Hohberger Fix #222961 - required for Conga to work. 2007-01-17 Patrick Caulfield Fix typo. thanks Bob. If we get killed by another node then print the reason in English rather than just a number. 2007-01-16 Robert Peterson Resolves: bz 222747: Remove references to lock_gulm from cluster man pages 2007-01-16 Lon Hohberger Resolves: #222485; patch from Simone Gotti Makes relocation work correctly. Apply patch from Simone Gotti; fixes #222744/#222838 2007-01-16 Lars Marowsky-Bree Add support for overriding the hostname drbd uses based on the clone number, to support floating peers. --HG-- extra : convert_revision : 5d3bd17fcc23b49fbbf1dc7234178c0b6e4fbcaf 2007-01-16 Patrick Caulfield Don't return to 'cman_tool leave' until we are just about to quit. Otherwise there can be a delay between cman_tool thinking that we are down, and the node really being out of the cluster. see bz#222686 2007-01-16 Lars Marowsky-Bree Silly typo. --HG-- extra : convert_revision : c8a9c53e728eb637cfd520e33d30596cde27a2b6 2007-01-15 Lon Hohberger Fix bug causing cluster.conf / rm log level to be ignored in resource agents 2007-01-14 Andrew Beekhof RA: Modify the Xen RA to use the new variable names --HG-- extra : convert_revision : ed38642e18c8c77ef1d8016cd5193cf7217c8d82 2007-01-14 Lars Marowsky-Bree RA: Xen: Swap migrate_to and _from to match PE. 
--HG-- extra : convert_revision : 92de87ae3f26f087726ad3fd4434b1200fac3b6f 2007-01-12 Andrew Beekhof Hg: Merge with upstream --HG-- extra : convert_revision : bf67c60382b185e7ead21543dff140e2fe0dd81b Hg: Merge back late 2.0.8 changes from Alan --HG-- extra : convert_revision : bbe39035b485c7cb3358b874e476c13f63c489e6 2007-01-12 Alan Robertson OSDL 1292 - xSeries STONITH (IBM: 06-R212-175 --HG-- extra : convert_revision : 2d298bca0d0af320752bfa293ac96ed08e2c6463 2007-01-12 Lars Ellenberg add configure option for the drbd outdate peer daemon --HG-- extra : convert_revision : d16e543a064f163b45c595943746b62ca2ae6f43 2007-01-11 Lars Marowsky-Bree Add migration support to Xen RA. --HG-- extra : convert_revision : 26d5068a03195c9d8a850b5526b06824dabe8ae6 2007-01-11 Andrew Beekhof Hg: Merge in RA patches from the community --HG-- extra : convert_revision : 0459c0dd6814cc69ff39cfa375cb42c7fb0365e7 2007-01-11 David Teigland Move memset(0) into the for loop so we're clearing the data buffer each time through. We were seeing some bogus data from group_tool -v. 2007-01-11 James Parsons bz222234 2007-01-10 Lon Hohberger Resolves: #221210 Allows fence_xvm to respond even if the virtual machine has never existed in the cluster. 2007-01-09 Andrew Beekhof Hg: Merge in migrate support --HG-- extra : convert_revision : 9a5d21817e74568bc6930553a88c6b240c01b377 2007-01-09 David Teigland add -K option to enable dlm kernel log_debug's (does nothing if /sys/kernel/config/dlm/cluster/log_debug doesn't exist) 2007-01-09 Andrew Beekhof RA: Support reload and migrate-(to|from) in the Dummy OCF agent --HG-- extra : convert_revision : 93fb820ee0df736c2d7adec60bb5e2df26343cb2 RA: Fix apache metadata for consistency --HG-- extra : convert_revision : cd207a8743859925951d381f8096ca4983abe8ac .cvsignore maintenance --HG-- extra : convert_revision : 84f206380f9cc89a975ce73806a68040d18f1d52 2007-01-09 Patrick Caulfield Add flood program back. quorumdev_poll is in milliseconds, not seconds!
Thanks to Simone Gotti 2007-01-09 Andrew Beekhof RA: Some improvements to the mysql RA suggested by Achim Stumpf --HG-- extra : convert_revision : dfd62039ac1290f544cba4957d8705c79e91ceaf 2007-01-08 Patrick Caulfield Don't lose NUL on the end of the fence-agent. thanks to Simone Gotti for the patch If there are already queued messages for a client then don't send new ones out of order 2007-01-06 Benjamin Marzinski Get GNBD compiling with the latest upstream kernel. 2007-01-05 David Teigland groupd creates uint32 global id's for each group. It doesn't use them itself, but provides them to each registered app to use if it wants. (The dlm and gfs each use the global id in messages to distinguish between different lockspaces/fs's.) groupd's method of creating these gid's (local counter | local nodeid) can result in duplicate gid's in the cluster given a somewhat uncommon sequence of events. bz 221629 2007-01-05 Patrick Caulfield Clear the node structure before calling cman_get_node(). Thanks to simone.gotti@email.it Send correct length of quorum device name sent to cman. Thanks to simone.gotti@email.it for the patch 2007-01-04 David Teigland added "flood n mode" function a while ago, doesn't equate to pjc's "flood" program even though that's what I was originally hoping to emulate 2007-01-02 David Teigland mount/umount modifications of /etc/mtab weren't smart enough to get straight two different fs's mounted on the same mountpoint bz 218560 2007-01-02 Patrick Caulfield Give a better error if the cluster name is too long. 2006-12-21 Robert Peterson Resolves: bz 219876: mount.gfs hangs if there are insufficient journals configured in the filesystem 2006-12-20 David Teigland Support mounting a single fs on multiple mount points. bz 218560 2006-12-19 David Teigland Fixes related to the needs_recovery state and first-mounter recovery. Probably not perfect yet, but working in the tests I'm able to contrive. 
bz 218551 When the first mounter is recovering all the journals, it should use TRY on the journal locks. There's one rare case where other mounters will exist who hold journal locks that we don't want to block on. That's when the other mounters are readonly, haven't been able to recover the fs after a node failure, and the next rw mounter is told to do first mounter recovery. The journals of these readonly nodes can be skipped when the pseudo-first mounter is going through all journals. Changed this a long time ago but never checked it in. bz 218551 2006-12-19 Robert Peterson Resolves: bz 219878: gfs2 creation should default to 1 journal and lock_nolock 2006-12-19 David Teigland revert last checkin 2006-12-19 Patrick Caulfield Fix bug where cman_dispatch(CMAN_DISPATCH_ONE) could dispatch several messages. 2006-12-19 Robert Peterson Resolves: bz 218560: multiple mount points fail with gfs and gfs2 Resolves: bz 219866: GFS init script - FATAL: Module lock_dlm is in use. 2006-12-18 Lon Hohberger Implement cap on max # of outstanding status check threads; fixes bugzilla #218697 2006-12-18 Benjamin Marzinski GNBD was hanging with the cfq scheduler, so I changed the default scheduler for all gnbd devices to the anticipatory scheduler. 2006-12-18 Patrick Caulfield Increase token timeout to 10s as per bz#216954 2006-12-16 Andrew Beekhof Hg: Merge crm-stable into dev --HG-- extra : convert_revision : b0a827c6f1aa62acdcc703958817f00e841dd8ec 2006-12-15 Benjamin Marzinski make it so that the -c and -[u|U] flags are mutually exclusive. Resolves bz 219413 2006-12-15 Patrick Caulfield Add cluster_id override field to cluster.conf, so that people can manually assign cluster IDs where the hash values for similar names clash 2006-12-14 Lon Hohberger Fix #216774, pass 3 Fix #216774 2006-12-14 David Teigland Switch from CMAN_DISPATCH_ONE loop to CMAN_DISPATCH_ALL to resolve delayed cman shutdown callbacks. 
bz 219385 2006-12-14 Lon Hohberger Fix #216774; missed rg_thread.c 2006-12-14 David Teigland Switch from CMAN_DISPATCH_ONE loop to CMAN_DISPATCH_ALL to resolve delayed cman shutdown callbacks. bz 219385 2006-12-14 Andrew Beekhof RA: Always use the calculated netmask This allows the admin to specify either type of netmask in the netmask field --HG-- extra : convert_revision : 9d22dd725b70ca95790c6280f3f082242f4d05bf 2006-12-14 Lars Marowsky-Bree Fix IPaddr meta-data to only advertise netmask parameter, which accepts both formats. --HG-- extra : convert_revision : f38f059836a8f8d3f061e916151263b539f0ece9 2006-12-13 David Teigland groupd's function that returns info for group status queries was mistakenly setting the "member" status to 0 when a node was leaving. This led fence_tool to believe that the local node was no longer a member (i.e. had finished leaving) when in fact the leave wasn't complete yet. bz 219385 2006-12-13 Lon Hohberger Fix #211468 - clustat always returns 0, but should give a nonzero code for non-running services. 2006-12-13 Lars Marowsky-Bree New OCF RA for rsyncd from Oza Dhairesh. --HG-- extra : convert_revision : 718e9851b075962f4197f81cf34fcd0d59a21346 Add OCF RA EvmsSCC to support EVMS2 shared containers and run evms_activate at the right times. --HG-- extra : convert_revision : 3a000cc91d6c4dcdcd71ce53e4cc9be42ca58834 2006-12-13 Lon Hohberger Fix segfault in clustat if node is not a cluster member 2006-12-13 Patrick Caulfield Fix typo that could affect shutdown.
see bz#219385 2006-12-12 David Teigland add lock_flood/unlock_flood/unlock_flood-exit commands to test doing large volumes of locks/unlocks 2006-12-12 Alan Robertson merge of code pulled from 'dev' --HG-- extra : convert_revision : 1c8f6a5c9da99d966723b5961c709859120fb8b6 Undid test logging code in IPaddr accidentally committed to Hg --HG-- extra : convert_revision : 398d3b3ee6bdb9025034b7dccfd959b83ceedf5a 2006-12-11 Lars Marowsky-Bree Novell 187080: ocfs2_init shouldn't be called for monitor / status operations. --HG-- extra : convert_revision : e8573fd6bf1a34f16ed5fcdbba579e62373edd0b 2006-12-08 Abhijith Das don't fail if unmounting configfs fails 2006-12-08 David Teigland tidy up some prints 2006-12-07 David Teigland very useful testing program I wrote a long time ago 2006-12-07 Lars Marowsky-Bree Merge crm-stable with dev. --HG-- extra : convert_revision : 1b5f016641ee149a4246b828e4c050ffb3e19332 Merge SAPDatabase and SAPInstance resource agents, courtesy of the SAP LinuxLab. --HG-- extra : convert_revision : 60625f83f1f18f280da11530325943c4ff2dddcc 2006-12-07 Patrick Caulfield Fix minor bug where cman_tool join didn't spot that aisexec had started correctly or crashed. This means we can up the timer for allowing aisexec to start with no ill effects. see bz#218688 2006-12-05 David Teigland Call into the lock module to do a withdraw instead of just calling BUG. bz 215962 When lockfs is called from the vfs (due to a dm suspend), don't try to do the lockfs if the fs is being shut down (due to a withdraw). bz 215962 Pass gfs_controld the device being mounted, it'll use this if it needs to withdraw the fs. bz 215962 Before doing the mount-group portion of withdraw, fork off a dmsetup to suspend the fs device. This means gfs doesn't need to call dm_suspend() in the kernel before calling out to us. The suspend waits for all outstanding i/o to return on the device which is necessary prior to telling other nodes to do recovery. 
(Later we should probably swap in an error table and resume the device.) bz 215962 change the default plock rate limit from 10 to 100 bz 216052 2006-12-04 Alan Robertson Trivial Comment and error message text updates. --HG-- extra : convert_revision : efa0bfbe81a64411d7efef345cf002aea2d60018 2006-12-04 Lon Hohberger Fix build error 2006-12-04 Andrew Beekhof RA: New parameter ctl_opt added to pgsql to support additional options for pg_ctl Patch supplied by Serge Dubrouski --HG-- extra : convert_revision : a31dc0be8451e47c5f1ada0aad9087b9f105ea82 2006-12-01 Robert Peterson Resolves: bz218134: GFS & GFS2: umount while busy gives bogus error message. 2006-12-01 Lon Hohberger Handle 0.1.9 case of libvirt returning a virDomainPtr + state for a VM that doesn't exist (vm state == VIR_DOMAIN_SHUTOFF) 2006-12-01 Abhijith Das bz 190196. gfs2_quota. Doesn't use sysfs anymore. Uses the gfs2meta filesystem instead. 2006-12-01 David Teigland Be more intelligent about handling recovery sets so we can deal with cases where a node fails, rejoins, then fails again before recovery has completed for the first failure. Also handles case where the groupd process exits without the node going down. If that happens, we want to kill the node (via cman) if the node was in any groups and ignore it otherwise. group_tool dump doesn't handle partial reads/writes, now we always dump entire fixed size debug buffer bz 214540 2006-12-01 Patrick Caulfield That 'if' really should have been a 'while'. If anyone can remember which bug this was supposed to fix, please pipe up :) 2006-12-01 Lon Hohberger Fix bug where fence agents were getting info up to groupd 2006-11-30 Lon Hohberger Fix bug reported by Fabio M. Di Nitto - duplicate definition of assign_noccs 2006-11-30 David Teigland From: Steven Dake We dispatch in the dispatch handler now instead of saving the data. 
Also we use dispatch all which will basically try to dispatch all messages possible in one go instead of only dispatching on each loop through poll. 2006-11-30 Robert Peterson Resolves: bz217798: Need to port Resource Group optimization from gfs1 to gfs2 2006-11-30 Patrick Caulfield Don't truncate the node name when we check for it unqualified. bz#217724 2006-11-29 Alan Robertson Updated the version number --HG-- extra : convert_revision : d245f9f0394938de683b167580b9199be319d653 2006-11-29 Robert Peterson Resolves: bz213763: mkdir takes more time on larger file systems. Made gfs_mkfs use RG sizes based on size of file system to maximize performance. Resolves: bz217436: Several updates needed to cluster.conf man page. 1. Removed references to gulm. 2. Other misc changes. Resolves: bz217436: Several updates needed to cluster.conf man page. 1. Added required nodeid="x" to cluster.conf example. 2. Added tag to cluster.conf example. 3. Added section on cluster.conf validation (Credit Jim Parsons) 4. Fixed spelling and grammar problems. 2006-11-28 David Teigland the fix yesterday to prevent a segfault when mount failed mistakenly also changed the exit point from the function causing the error to not be written back to mount.gfs 2006-11-28 Robert Peterson Resolves: bz216902: mkfs.gfs2 allows non-4K block size. The executive decision was made to remove the -b option in mkfs.gfs2 until we can get all of this sorted out with the gfs2 kernel. Resolves: bz217460: fence_tool man page updates needed. Thanks go to Fabio Massimo Di Nitto for contributing this. 2006-11-28 James Parsons file fence_baytech.py was initially added on branch RHEL4. 2006-11-28 Patrick Caulfield Tell cman when the config file has been updated 2006-11-27 David Teigland if mount fails, don't try to save the mg info for the new group since there won't be any mg and we'll segfault 2006-11-27 Ryan O'Hara Fix comment. Fix exit status be rval. 
2006-11-27 Lon Hohberger Fix #213878 - segfault in rg_thread.c due to improper loop semantics 2006-11-27 Wendy Cheng bugzilla : 217374 - temporarily disable GFS1 withdraw until bz215962 is ready. 2006-11-27 Ryan McCabe Add DRAC5 and DRAC4/I support Related: #211836, #211918 2006-11-23 Andrew Beekhof Merge crm-stable back into dev --HG-- extra : convert_revision : a3cf1c79e2e89727cb21319851b9cd7dc526c086 2006-11-22 Robert Peterson Resolves: bz216898 mkfs.gfs2 needs to zero the first 16 blocks of file system 2006-11-22 Andrew Beekhof RA: Report status failure when an IPaddr is active on a different interface but allow it to be stopped Indicate when configured values are ignored Reinstate some OCF logging --HG-- extra : convert_revision : 026bab6b838476d7c5b2e14ad955a81c527fbae4 2006-11-21 David Teigland handle errors or short reads when reading /dev/misc/lock_dlm_plock 2006-11-21 Lon Hohberger Fix #213218 2006-11-20 David Teigland use timersub() macro to subtract timevals instead of coding it fix a couple of problems if openais enables flow control: - the poll loop spins due to plocks being ready to process but being ignored due to the flow control; we need to remove the plock fd from the poll set when flow control is enabled (just like we do when the plock rate limiter is active) - we were not updating the flow control state from openais when flow control was enabled unless we received a cpg message; we need to update it periodically while blocked since we may not receive cpg messages from other nodes causing us to update the state The plock rate limiting code should use the full timeval to measure the 1 sec limit interval instead of just the rough difference in tv_sec values. 2006-11-20 Robert Peterson Fix another case where lf_dirent_format was not rewritten to disk after it was fixed. 2006-11-17 Robert Peterson Resolves: bz208836 - fatal: invalid metadata block 1. Fix a memory leak in pass1b. 2. Improve performance of pass1b by combining loops through fs. 3. 
Give an error message and abort if file system > 16TB and node architecture is 32-bit. 4. Give users "Abort", "Continue" and "Skip" options if they interrupt with ctrl-c. Also, report progress for that pass on interrupt. 5. Added more "percent complete" messages for other passes. See bz comment #33 for more details. 2006-11-17 David Teigland if read() returns a non-EINTR error then abort if read() returns a non-EINTR error then shut down the client 2006-11-17 Wendy Cheng Bugzilla 214274: Oops... only directIO has this issue - buffer IO should be fine. Revert buffer io changes. 2006-11-17 Robert Peterson Resolves: bz215817 umount caused a 'filesystem consistency error' kernel BUG 2006-11-17 Wendy Cheng Bugzilla 214274: GFS has been splitting large writes into smaller atomic transactions. This would generate multiple aio completion calls (one for each transaction) that falsely notify the application about data completion. Problem is reported by QA team as data corruption. 2006-11-16 Ryan O'Hara Fix annoying whitespace inconsistency. Detect and fix potential endian problem in lf_dirent_format. 2006-11-15 Lon Hohberger Fix error reporting from cman if run while xend is not running. 2006-11-15 David Teigland fix sched_priority from sdake uncomment scheduler settings fix sched_priority from sdake 2006-11-14 Robert Peterson Resolves: bz211465 fsck errors on gfs2 volume 2006-11-14 David Teigland Default plock rate limit of 10 instead of 0. Add plock rate limit option -l . Current default is no limit (0). If a limit is set, gfs_controld will send no more than that many plock operations (multicast messages) every second. Given a limit of 10, one file system where plocks are used, and a program that does a tight loop of fcntl lock/unlock operations, the max number of loop iterations in 1 second would be 5. If eight nodes were all doing this there would be 80 total network multicasts every second from all nodes in the cluster.
We also record the volume of plock messages accepted locally and received from the network in the debug log. A log entry is written for every 1000 locally accepted plock operations and for every 1000 operations received from the network. 2006-11-14 Robert Peterson Ability for gfs2_edit to handle gfs1 indirect metapointers. 2006-11-13 Ryan O'Hara Include sd_freeze_count in counters output. This will allow users to see the freeze count via gfs_tool counters. 2006-11-13 Chris Feist Fixes to prevent compile time warnings/errors in brew. Need to include directory for ccs.h header file. 2006-11-13 Lon Hohberger Fix bugzilla #212474; fully integrates fence_xvmd with ccs & the cman init script 2006-11-10 Benjamin Marzinski fix for bz215095 & 215099. for 215099, gnbd now only handles signals in sock_xmit() when it is called by the gnbd_recvd process. Otherwise, it simply blocks the signals until it completes the IO. This keeps gnbd from sending partial requests to the server, which can lead to data corruption. for 215095, the gfs function clean_journal() now uses the noinline attribute, gfs_find_jhead() only uses one struct gfs_log_header, and gfs_recover_journal() dynamically allocates its struct gfs_log_header, all to conserve stack space. In the gnbd function sock_xmit(), you no longer get the signal info, so gnbd_recvd cannot print which signal it received, but it saves over 120 bytes of stack space. 2006-11-09 Robert Peterson This is the fix for Bugzilla Bug 214524: group_tool dump can give short output. This is the fix for Bugzilla Bug 214625: Add group_tool log function to group_tool and groupd. This is the fix for Bugzilla Bug 214621: Allow gfs2_edit to view, print and edit gfs(1) file systems. 2006-11-09 Patrick Caulfield Set join_timeout and consensus_timeout to higher defaults as per bz#214920 2006-11-08 Robert Peterson This is the fix for Bugzilla Bug 214513: gfs2_convert must reject file systems with block size != 4K.
2006-11-08 Patrick Caulfield Always compile in debug logging - you never know when it might come in handy and it's disabled by default anyway. 2006-11-06 Alan Robertson OSDL 1443: Make it so we can build RPMs with management daemon disabled NOTE: This also disables building the GUI package --HG-- extra : convert_revision : 67b3d9bdab1cec58a67610f82f0ad3b8f712852f 2006-11-06 Marek 'marx' Grac Bug #213524. Resource agent for named + patch for stopping applications 2006-11-06 Huang Zhen RA: add resource agent for IBM websphere 6 --HG-- extra : convert_revision : 9d5209dd3145a1a16b2e35d40f4671d8d08d8f4a RA: Convert original SendArp heartbeat RA to OCF one, rewrite heartbeat one with calling OCF one --HG-- extra : convert_revision : 37292ecf326e4a9234bf3103cf56bcf51a7d60e1 2006-11-06 Patrick Caulfield if an AISONLY node dies, mark it DEAD bz#213747 comments 9-13 (ish) 2006-11-03 Ryan O'Hara Added fence_scsi_test to help test SCSI reservation capabilities. 2006-11-03 Lon Hohberger Fix bugzillas #212444, #212433 2006-11-03 David Teigland When a new master joins the mountgroup, it retrieves plocks from the ckpt created by the old master, then unlinks and closes the ckpt so it can create another new ckpt later. Bug found by sdake where the ckpt close following the unlink was being skipped because the ckpt handle wasn't being set. 2006-11-03 Patrick Caulfield fix bz#213747 Basically we don't let a node join a cluster that already has "Disallowed" nodes in it as we don't consistently know the state of the cluster in that case (it could be two inquorate halves for example). Sorry, Steven, this is yet another instance where cman has to exit() the aisexec process for the greater good of the cluster. I've also enhanced "cman_tool nodes" to show the disallowed nodes and a warning message that the cluster is in a bit of a mess.
2006-11-03 Alan Robertson OSDL bugzilla: 1442 - Port to AIX --HG-- extra : convert_revision : 3820abc75685b71be3232334f565a4e2b76b1f17 2006-11-02 Alan Robertson Put in patches to make it compile on AIX (round 1) --HG-- extra : convert_revision : 6441ab304d6d96ed2c9736b572a1ce7ace0f6475 2006-11-01 alanr@servidor.linux-ha.org Fixed a bug in the DB2 resource agent: Incorrect exit codes for both status and monitor operation - for the case where DB2 was stopped. This doesn't affect R1-style configurations - since it treats all non-zero return codes as equivalent. --HG-- extra : convert_revision : 968e8f8e995e38021edec078a58f91bcc4a1ea7f 2006-11-01 Lon Hohberger Apply patch to fix build on newer kernels from Fabio M. Di Nitto 2006-10-31 Patrick Caulfield On Steven Dake's recommendation, also up the token_retransmit count to 20. and fix a couple of typos. Set the default token timeout to 5 seconds. It can still be overridden in cluster.conf if required. 2006-10-30 Lars Marowsky-Bree Use CFLAGS when building ccdv to make a warning from the build system go away. --HG-- extra : convert_revision : 94243820b0118b0d113764320468d6e750b0ff29 2006-10-30 Patrick Caulfield Lon's patch to use /etc/sysconfig/cman for customisation. bz#212393 2006-10-27 Wendy Cheng Bugzilla 211622 - Root issue is found and fixed. Back out the workaround. 2006-10-27 Andrew Beekhof Merge crm-stable back into dev --HG-- extra : convert_revision : feeff9c4270d43f51f5abbb513e72d0ccc8d177a 2006-10-26 Abhijith Das bz 211418. Modified gfs2_tool and gfs2_jadd to use the new inode flags in fs.h instead of deprecated iflags.h 2006-10-25 Lon Hohberger Update Changelog Fix #212074 2006-10-25 Patrick Caulfield fix CMAN_DISPATCH_ALL. Patch from Mikhail A Zelikov which got inexplicably lost, sorry. 2006-10-25 Benjamin Marzinski This is a bugfix for bz #211923. When you can't mount a filesystem because you already have it mounted, or for some similar reason, a helpful message is now printed so the user can fix the problem.
2006-10-24 David Teigland recent commit fixing bz 210344 removed the memset so we're getting garbage back sometimes Clear out configfs dirs that we've created before exiting. Allows the dlm kernel module to be removed straight away now. bz 211924 (code setting scheduler priority also added but commented out) clean up gross code 2006-10-24 Wendy Cheng Bugzilla 211622: GFS1 will assert at xmote_bh() if DLM grants SHARED lock to direct IO's DEFERRED request. Add LM_FLAG_ANY to direct read to allow relaxed state and change direct write to use EXCLUSIVE lock. 2006-10-24 Lon Hohberger Fix #211701 (rgmanager + clustat hangs), #211933 (xenvm rename -> vm) 2006-10-23 Benjamin Marzinski Really gross hack!!! This is a workaround for one of the bugs that got lumped into 166701. It breaks POSIX behavior in a corner case to avoid crashing... It's icky. When NFS opens a file with O_CREAT, the kernel nfs daemon checks to see if the file exists. If it does, nfsd does the *right thing* (either opens the file, or if the file was opened with O_EXCL, returns an error). If the file doesn't exist, it passes the request down to the underlying file system. Unfortunately, since nfs *knows* that the file doesn't exist, it doesn't bother to pass a nameidata structure, which would include the intent information. However since gfs is a cluster file system, the file could have been created on another node after nfs checks for it. If this is the case, gfs needs the intent information to do the *right thing*. It panics when it finds a NULL pointer, instead of the nameidata. Now, instead of panicking, if gfs finds a NULL nameidata pointer, it assumes that the file was not created with O_EXCL. This assumption could be wrong, with the result that an application could think that it has created a new file, when in fact, it has opened an existing one. 2006-10-23 Abhijith Das Adding Josef's noquota mount option for GFS1 in RHEL5.
Original bz 205285 2006-10-23 David Teigland Patch from Abhi to fix case where a node's mount is rejected by other group members causing gfs_controld on the mounter to leave the group immediately. It was sometimes leaving before its join was even finished which caused groupd to reject the leave, so we need to wait for the join to complete before doing the leave. 2006-10-21 Robert Peterson This is the fix for Bugzilla Bug 210344: group_tool does not handle short reads. 2006-10-20 Lon Hohberger Roll back patch to resrules.c Roll back patch to clusterfs.sh Fix 202637 - error reporting missing from some agents Compatibility fix for resource agents between linux-cluster and linux-ha 2006-10-20 David Teigland we weren't cleaning everything up for a client upon POLLHUP 2006-10-20 Robert Peterson Fix for Bugzilla Bug 211405: If groupd segfaults, dump the most recent log information. This is the fix for Bugzilla Bug 210732: ccsd doesn't spot cluster going quorate. The fix was written by Patrick Caulfield, but I tested it and it now works properly. I'm doing the commit because Patrick is out today. 2006-10-19 Andrew Beekhof RA: OSX Needs a slightly different command to remove an IPaddr --HG-- extra : convert_revision : 5acef5f9a608e87e6923d818c6bffb772edf4932 2006-10-19 Lars Marowsky-Bree When the IP was bound on the lo* interface, but lvs_support was not explicitly enabled, the "status" operation would hang by passing an empty filename to grep. --HG-- extra : convert_revision : bf2b6e70b8ae060df471f1bb20895b86631ac6b5 2006-10-18 Robert Peterson This is the fix for bugzilla bug 211337: must create core files for daemons on segfault. 2006-10-17 Wendy Cheng Port RHEL4 GFS AIO (asynchronous IO) implementation into RHEL5/FC6 and community-version of GFS1. 2006-10-17 Patrick Caulfield Get notifications BEFORE getting state otherwise we have a race condition.
probably fixes bz#210732 2006-10-16 Lon Hohberger Updated xenvm resource agent 2006-10-16 David Teigland Recent changes to mount scenarios (mounts while another node is doing first mount recovery) added a couple places where we need to clear the "save_plocks" flag to allow a new mount to begin processing plock requests. 2006-10-16 Patrick Caulfield 'while' should be an 'if' 2006-10-16 David Teigland typo, deleting "rs" instead of "re" when cleaning stuff up fix typo in debug message fix style badness A node that was just added would incorrectly conclude that the node after it needed to do first mounter recovery. 2006-10-16 Lon Hohberger Fix #209544 - umount failing on gfs/nfs services 2006-10-16 Patrick Caulfield Sigh, got the condition back-to-front. This should fix the AISONLY status (again). 2006-10-15 Wendy Cheng Just found 2.6.18 kernel has something called down_read_non_owner for rwsemaphore. If we can implement a similar function that does something like "up_write_if_owner", then we can put i_alloc_sem back to correct state. Correct the comment and mark this possibility. Bugzilla 203170 - direct IO deadlock: We'll have the same deadlock as described in bugzilla 173912 without RHEL4 kernel DIO_CLUSTER_LOCKING flag. To work around this issue, the i_alloc_sem is dropped from GFS. We expect glock will be able to handle the local synchronization. 2006-10-14 Robert Peterson This is the fix for bugzilla bug 210369: acls are not enabled after remount. The problem was a combination of things, but mainly due to the gfs mount helper mount.gfs2 not passing the mount parameters on in the extras string during a remount. The mount helper was also incorrectly putting some messages into stdout. 2006-10-14 Benjamin Marzinski Make gnbd work with cman correctly. This sort of roughly falls under the heading of bz #210415 2006-10-14 Robert Peterson This is a fix for bugzilla bug 210300: Unknown mount option "users".
The gfs and gfs2 mount helper (/sbin/mount.gfs2) was aborting if it saw mount options that are not part of mount.h (i.e. internal to mount and vfs). The fix is to add the missing options so the mount helper will recognize them properly. 2006-10-13 Chris Feist We don't want to delete the scsi_reserve init script when doing a make clean. 2006-10-13 David Teigland Fix an effect of recovery mixed with joins where the node whose join event was interrupted by the recovery can sometimes not have its g->joining flag cleared which would cause a later unmount to hang. The corresponding changes to the gfs_controld changes in handling mixed mounts and recoveries and failed mounts. We now tell gfs_controld when our mount has completed and the result using the same connection that we created when requesting the mount. Handling a lot of hard situations in the areas of: - recoveries mixed with mounts in lots of different ways - mount failures while lots of nodes are mounting in parallel (Part of this is also an update to mount.gfs, both gfs_controld and mount.gfs need to be updated together.) 2006-10-13 Robert Peterson This is the fix for bugzilla bug 210587: Oops in gfs_get_dentry via NFS. The gfs file system, when called by NFS, was sometimes referencing the vestigial license file, causing the segfault. 2006-10-13 David Teigland replace spaces with tabs 2006-10-13 Ryan O'Hara Remove unnecessary chmod for scsi_reserve. 2006-10-13 Robert Peterson This fix is for bugzilla 210641: Race condition hang/failure between cman daemons and groupd. Added a retry with timeout to group_init and all its callers. 2006-10-13 David Teigland If cpg_join or cpg_leave are stuck in a retry loop, put an error message in syslog after ten seconds. 2006-10-13 Robert Peterson This is for bugzilla 210162: fence_tool needs -w and -t options to wait for group membership. 
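The retry behaviour mentioned in the 2006-10-13 entries above (a retry with timeout in group_init, and a syslog message once cpg_join or cpg_leave has been retrying for ten seconds) can be sketched roughly as follows. This is an illustration only, written in Python rather than the project's C; the function name `retry_with_timeout` and all of its parameters are hypothetical and do not come from the cluster sources, and combining the timeout and the one-off warning in a single loop is an assumption made for brevity.

```python
import time


def retry_with_timeout(attempt, timeout, warn_after=10.0, interval=1.0,
                       warn=print, clock=time.monotonic, sleep=time.sleep):
    """Call attempt() until it returns True or `timeout` seconds elapse.

    A single warning is emitted once the loop has been retrying for at
    least `warn_after` seconds. All names here are hypothetical.
    """
    start = clock()
    warned = False
    while True:
        if attempt():
            return True
        elapsed = clock() - start
        if elapsed >= timeout:
            # Give up; the caller decides how to report the failure.
            return False
        if not warned and elapsed >= warn_after:
            warn("still retrying after %d seconds" % int(elapsed))
            warned = True
        sleep(interval)
```

The clock and sleep functions are passed in so the loop can be exercised without real waiting; a daemon would more likely pass a syslog wrapper as `warn`.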
2006-10-13 Lon Hohberger Ancillary patch to fix 202492 and actually add back groupmember attr, not just rgmanager (per-node) attr 2006-10-13 Ryan O'Hara Add code to create initdir if it doesn't exist. 2006-10-12 Chris Feist Added changes to support installing init scripts w/ brew build. 2006-10-12 Ryan O'Hara Remove scsi_reserve from "all". This will be handled by the agent make target. Add scsi_reserve init script to Makefile so that it gets installed. 2006-10-12 David Teigland Handle the case where we're the second node being added to the group and the only other member fails. We need to go ahead and process our join. 2006-10-12 Lars Marowsky-Bree Fix missing character in comparison. --HG-- extra : convert_revision : 6435f97e4eb2ea0cc7771fab386f65602a0e86af IPaddr2 only works on Linux. --HG-- extra : convert_revision : 94a31585bde5a726cf0be7c4523cc26072fa0975 Merge with crm-stable --HG-- extra : convert_revision : 5508d686f22199cb570cee79445c797d60771985 2006-10-12 Andrew Beekhof RA: IPaddr can't run on Solaris, lose the compatibility code --HG-- extra : convert_revision : e29d104e20f4bc60b5d73a70a78c21d4a1acaaa0 2006-10-12 Benjamin Marzinski Change the way gnbd notifies multipathd about device changes, to deal with the new udev. 2006-10-11 Andrew Beekhof RA: Patch to allow ServeRAID to function in the real-world. From Jon Fanti. On start, if we're already started then there is nothing to do. No need to do an unmerge and remerge. Exit with the correct exit code for status/monitor ops Support ServeRAID Command Line Interface v7.12.05 (771) Removes "GROUP" string from opssend command-line --HG-- extra : convert_revision : ee41208b0b0fa6dd9a7838f48c62da23558395a6 2006-10-10 Patrick Caulfield Avoid spurious messages. and also fix an odd node count when nodes rejoin. 2006-10-10 Marek 'marx' Grac Script for parsing Tomcat's conf/server.xml Bug #204784.
Adding Tomcat resource agent 2006-10-10 Huang Zhen don't build quorumd if gnutls is not available --HG-- extra : convert_revision : 012bb42691b23afff4365991f6225669987ccdd1 2006-10-09 David Teigland if we get a plock request from the kernel when plocks are disabled, return -ENOSYS for the request add -p option to completely disable plocks/ckpts 2006-10-09 Patrick Caulfield If there are disallowed (AISONLY) nodes in the cluster, then name & shame them. Don't fence a node if it has already been fenced. bz#204633 2006-10-06 Lon Hohberger Fixed 202492, not 202497... Fix #202497 2006-10-06 David Teigland This is a big batch of code that gets us further along the path to handling recoveries mixed with joins (gfs mounts). The test I've been using to work on this is inserting a BUG() at the start of gfs_lm_get_lock() on six of eight nodes and then mounting on all of them in parallel. We should end up with the two nodes without the BUG properly mounted and the six with the BUG properly recovered. - check cpg flow control status from openais when processing plocks - handle case where we're mounting and the only other mounted node fails -- we need to become the first mounter if we've not begun mount(2) yet - journal recovery requests need to be fed serially to gfs, we weren't doing that in the case where a gfs journal recovery was in progress when another node failed make the number of clients a global variable so it will be easier to add clients later 2006-10-06 Andrew Beekhof Admin: give the cluster shell a less generic name --HG-- extra : convert_revision : 63881869060ab774a656716e03a6dfaace4c9f04 2006-10-06 Chris Feist Update building for xvm fence agent to build cleanly in brew.
2006-10-05 Lon Hohberger Fix #208115 Add --enable_xen configuration option (off by default), and make sure -V flag works for fence_xvm[d] 2006-10-05 David Teigland don't configure gfs-kernel or gnbd-kernel now that they're not built by default update gfs-kernel (gfs1) and gnbd-kernel are going to track the RHEL5 kernel in cvs head. We want the default top-level build of cvs head to work on upstream kernels, though, e.g. 2.6.19, for people who are trying out upstream gfs2/dlm. So, comment gfs-kernel and gnbd-kernel out of the top level makefile. Once we create a RHEL5 branch, we can uncomment them there (and perhaps remove gfs-kernel and gnbd-kernel from cvs head.) 2006-10-05 Lon Hohberger Implementation of client/server based Xen Virtual Machine (xvm) fencing. This allows fencing of a virtual machine from any other virtual machine in the cluster (regardless of the physical host) which shares the same private key, either based on UUID or Xen domain name. Please see README and TODO before posting feature requests. 2006-10-05 Chris Feist - Added in fixes to make gfs-kmod compatible with the RHEL5 kernel - removed inode->i_blksize references - Using i_private instead of u.generic_ip in the inode struct 2006-10-05 David Teigland updates 2006-10-05 Patrick Caulfield A bit of a hack to cope with the race condition where dlm_controld gets the groupd callback before the cman one and tries to start a DLM lockspace before all the node addresses are known. I think this will fix bz#207197 Add some extra semantics to CMAN to cope with openAIS rejoins. Basically, this adds an extra state to a node: AISONLY which is only cleared when cman receives a valid TRANSITION message from the node. A TRANSITION message is deemed to be invalid if the join_time of the node has not been changed (this is the timestamp the daemon was started) and the node has since been down and is rejoining. 
cman_tool will show if this is the case for a cluster by displaying the DisallowedNodes flag in the "cman_tool status command". If there are disallowed nodes in the cluster then the "cman_tool expected" command is disabled until those nodes have been removed. 2006-10-05 Andrew Beekhof RA: Patch to oracle agent from Dejan Muhamedagic - sqlplus in some Oracle 9 installations produces some superfluous output - wrong status op in case one database sid is a prefix of another --HG-- extra : convert_revision : 33bb352cf17e0cb43d660bc5cdf060146ed5b89d RA: Patch to pgsql.in from Serge Dubrouski 1. Removed checking for PostgreSQL process name for compatibility with PostgreSQL 8.2 2. Added optional pghost parameter that allows monitoring PostgreSQL on a particular IP Address. --HG-- extra : convert_revision : aa0b36caa635fe8dc65e8ad7b26989ab40068599 2006-10-04 Robert Peterson Add -w option back to fence_tool join in cman init script. Add the "-w" (wait) and "-t" (timeout) parameters back in to fence_tool. 2006-10-04 Marek 'marx' Grac This patch pushes generated configuration files for service in /etc/cluster/ (RA_COMMON_conf_dir) where each service (samba, openldap, ...) has its own directory. In this directory is another directory with instances (OCF_RESOURCE_INSTANCE). Our generated configuration files are not re-generated when user changes them, that is the reason why they are in /etc and not in /var. 2006-10-04 David Teigland set the "member" field in the group_data struct that's returned when querying for group information 2006-10-04 Andrew Beekhof Merge with crm-stable --HG-- extra : convert_revision : 24f2d2b1387fa10d858b5847828bd4d61935d617 2006-10-03 Lon Hohberger Fix #208577 2006-10-03 Ryan O'Hara Added gfs_security_init to initialize SELinux xattrs for newly created inodes.
2006-10-03 Andrew Beekhof RA: Two new resources from Matthias Dahl for managing OpenVZ VEs and RAIDs --HG-- extra : convert_revision : 34e3d1d9ba991406c7288818ec9b324c40be9e35 RA: Implement validate-all for pgsql - Courtesy Serge Dubrouski --HG-- extra : convert_revision : a905368a9f38a5ec95348a9f87a86424c0bcd3a8 2006-10-02 Ryan O'Hara Add GFS_EATYPE_SECURITY as valid xattr type and increment GFS_EATYPE_LAST. Without this gfs_fsck will complain (and remove) SELinux xattrs. 2006-10-02 Patrick Caulfield Don't even start up if the local host name resolves to 127.0.0.1 2006-10-02 Andrew Beekhof RA: Fix for OSDL 1422 - refresh interval off by a factor of 1000 --HG-- extra : convert_revision : 68d9618c955fd110ab89ff5b6634dfda644904fd Merge with crm-dev --HG-- extra : convert_revision : 9e72d59c1aae66b1ffbf7d20460560e229144da4 2006-10-01 Andrew Beekhof Make sure ocf-returncodes is present in the dist tarball --HG-- extra : convert_revision : b1e3e8d85b7820f5ad095a71e48bda8d6a6d8133 Revert inappropriate fix for an rpm build issue --HG-- rename : heartbeat/ocf-returncodes.in => heartbeat/ocf-returncodes extra : convert_revision : 413a6475fbb8c6da378a28ad7328a0749956dd39 2006-09-30 Huang Zhen add ocf-returncodes to the script list --HG-- extra : convert_revision : b845773f185873af9d283dc2af2ee96ff8745374 rename ocf-returncodes to ocf-returncodes.in --HG-- extra : convert_revision : 95afeef95dc30e77dfa6ed7dc53ed96394a1f60b rename ocf-returncodes to ocf-returncodes.in --HG-- extra : convert_revision : eb28a7552991f2cb0eee0a9fd4e32d06ed83a56a 2006-09-29 Marek 'marx' Grac Test if PID file of the application points to running PID. If not then this PID file is deleted and application can start.
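The stale PID file check described in the 2006-09-29 entry above can be sketched as follows. This is a minimal illustration in Python, not the resource agent's actual code (the agents themselves are shell scripts); the function names `pid_is_running` and `clear_stale_pid_file` are hypothetical, and treating an unparsable PID file as stale is an assumption.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def clear_stale_pid_file(path):
    """Return True if the application may start.

    A PID file that points at a dead process (or cannot be parsed) is
    removed; a PID file that points at a live process is left alone.
    """
    try:
        with open(path) as fh:
            pid = int(fh.read().strip())
    except FileNotFoundError:
        return True          # no PID file at all, nothing to clean up
    except ValueError:
        os.remove(path)      # assumption: garbage content counts as stale
        return True
    if pid_is_running(pid):
        return False
    os.remove(path)
    return True
```

A caller would run `clear_stale_pid_file()` before the start operation and refuse to start a second instance when it returns False.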
2006-09-29 Andrew Beekhof RA: Better handling of return codes in pingd_stop() --HG-- extra : convert_revision : b8fa057ebf831ff8475b34ddaf6dc5bb3691e704 2006-09-29 Marek 'marx' Grac Some applications need time to stop all their processes, so we have to wait a few moments until the main/parent process is finished. This patch adds an option 'shutdown_wait' for each application's RA. 2006-09-28 Lon Hohberger Fix segfault due to missing param 2006-09-28 David Teigland put a message in syslog if we get a cpg error that we can't deal with 2006-09-28 Abhijith Das memory violation 2006-09-27 Lon Hohberger Fix 202498 Clean up build Apply patch from Fabio M. Di Nitto to fix clustat service name expansion bug Fix various bugs, incl. 208011, 203762 2006-09-27 Huang Zhen pull from hg.linux-ha.org/dev --HG-- extra : convert_revision : 37a08a5c9b37a8bc7d84a49e39eca6ba18293579 2006-09-27 Lon Hohberger Fix failed->disabled state transitions; #208011 2006-09-27 Marek 'marx' Grac Add check if the instance of RA has parent (variable service_name) Adding Samba resource agent (tag ). We already have resource agent for Samba but this is written in the same way as the other application's RA (mysql, apache, ...). Old-style RA stays available (tag ) so it won't break backward compatibility. Adds new function (generate_name_for_pid_dir()). Minor update of message texts. 2006-09-26 David Teigland Adding -vv to the groupd command line will result in a log_debug for every cpg send and every cpg recv.
Add debugging in four areas to help us know more quickly when something might be wrong at the cpg level: - log if cpg flow control goes on - log when we're waiting to receive a cpg event for our own join - when we're in a FOO_STOP_WAIT or FOO_START_WAIT state, log how many more cpg messages we're waiting to receive before moving on to the next state - save the event id of the last cpg message we sent, and clear that value when we receive that message back (this value is printed to the debug log when someone runs group_tool, not shown in the group_tool output) 2006-09-26 Huang Zhen add quorumd --HG-- extra : convert_revision : 3b915a887b9926a1bb01691f8440a5c4614c5dae 2006-09-26 Andrew Beekhof Merge with crm-dev (PEngine re-write) and other fixes --HG-- extra : convert_revision : 7eda8f01a9cd0d27e3865850f7648232b25db1a3 Don't shadow start_delay Consolidate the delays into one variable (this is an example RA only) --HG-- extra : convert_revision : 2cb5947970511135677db48a5cd7cd74505a1028 2006-09-25 Patrick Caulfield Add struct entry for .flow_control to keep latest openais happy. 2006-09-22 Patrick Caulfield Cope with a node being fenced manually and then going offline (i.e. someone else fenced it). Some further modifications to fenced can then prevent double-fencing with the following caveats: - the clocks on the nodes are reasonably in sync - the node goes down within two minutes of the fence message being received If there is any ambiguity then cman will mark the node as unfenced so that fenced can do the job anyway. 2006-09-22 Andrew Beekhof Add a manpage for crm_resource Contributed by Gildas Le Nadan with the permission of his employer and under the terms of the GPL --HG-- extra : convert_revision : 1bb39a52441116a4885c17e891e094766738931c 2006-09-21 Benjamin Marzinski Fix for bugzilla #207599. The individual gserv processes inherit the atexit callbacks from the main gnbd_serv process. One of those kills all the gserv processes. Now they don't do that.
2006-09-21 Lon Hohberger Apply resource-instance-name.patch 2006-09-20 David Teigland Get lm_interface.h from the kernel instead of keeping a duplicate copy here. Requires recent upstream gfs2 change that moves lm_interface.h from fs/gfs2/ to include/linux/ 2006-09-20 Robert Peterson Addendum to bz 200883. If gfs2_fsck can't finish initialization, it was exiting but not fixing the lock protocol back for normal use. Addendum to bz 200883. If gfs_fsck can't finish initialization, it was exiting but not fixing the lock protocol back for normal use. This is a crosswrite from gfs1 for bugzilla bz 200883: gfs_fsck segfaults on very large file systems. The same problem existed and is now fixed in gfs2_fsck and libgfs2. This is the fix for bugzilla 200883: gfs_fsck segfaults. The problem was that gfs_fsck was running out of memory for in-core bitmaps when run on very large file systems. For example, 45T requires about 11GB of memory. This fix doesn't allow it to run, this just exits gracefully, tells them why, and how much additional memory is needed. 2006-09-20 Marek 'marx' Grac After upgrade to 'unified names for PID files' we can clean code a bit. Adds possibility to add command line options to MySQL RA. Names of variable in RA's metadata are changed to unify style. Adds possibility to add command line options to Apache RA. Names of variable in RA's metadata are changed to unify style. Bug #204058. Adding resource agent for PostgreSQL 8 2006-09-19 Jonathan Brassow lsnodes -> lsnode typo. 2006-09-18 Chris Feist - Fix for bz #206325, ccs should not be started with the '-X' option & return the socket file descriptor instead of '0' when returning from ccs_open. 2006-09-18 Marek 'marx' Grac PID files are stored in common directory. Name of the PID file is generated from the OCF_RESOURCE_INSTANCE. Resource agents for Apache, MySQL and OpenLDAP are updated. 
2006-09-15 David Teigland positive return code from recover_current_event() should just indicate that the event should be processed again, and not added to the return value of process_app() which causes the whole thing to be called in a loop have groupd set the scheduler to RR priority 2, same as gfs_controld 2006-09-14 David Teigland Fixes a really stupid bug checked in yesterday that causes groupd to seg fault due to referencing a pointer that's not been set yet. handle short/interrupted writes/reads 2006-09-14 Benjamin Marzinski file log.h was initially added on branch RHEL4. file log.c was initially added on branch RHEL4. 2006-09-14 Marek 'marx' Grac typing error Simplifying scripts: The basic method of monitoring service is to check for PID file and test if we have such process. This function should be used by every RA for application. 2006-09-14 Patrick Caulfield Cope with short writes to the cman socket. bz#206093 2006-09-14 Marek 'marx' Grac Bug #204060. Adding OpenLDAP resource agent 2006-09-13 David Teigland update per the gfs2 upstream changes to the lock module interface: - remove sync_lvb - remove lm_lock_t, lm_lockspace_t, lm_fsdata_t typedefs 2006-09-12 David Teigland Use the event_nr arg provided in start_done to check if the start_done callback should be ignored; were ignoring the event_nr. The check of the current event state covered it, but ended up producing an unnecessary warning in syslog. undo junk mistakenly added by last commit remove stuff from dlm/nolock/harness since it all comes from upstream now 2006-09-12 Patrick Caulfield Fix strdup braindamage that probably caused segfaults when nodes left the cluster. This is likely to be the cause of bz#206083 (thanks to Steven Dake for most of the diagnostics on this). 
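Several entries above ("handle short/interrupted writes/reads", "Cope with short writes to the cman socket") refer to the same pattern: a write call may accept fewer bytes than requested, so the caller has to loop until everything has been written. The following is a rough Python sketch of that loop, not the project's C code; the name write_all is hypothetical.

```python
import os


def write_all(fd: int, data: bytes) -> int:
    """Write all of `data` to `fd`, continuing after short writes.

    Returns the total number of bytes written.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        try:
            # os.write may accept only part of the buffer (a short write).
            written = os.write(fd, view[total:])
        except InterruptedError:
            # EINTR: Python 3.5+ normally retries by itself (PEP 475);
            # the branch is kept to mirror the equivalent C loop.
            continue
        total += written
    return total
```

In C the same idea is a loop around write(2) that advances the buffer pointer by the returned count and retries when errno is EINTR.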
2006-09-09 David Teigland use same retry delay on cpg sends as gfs_controld, usleep(1000) between each retry - minor change to the delay we add between each cpg_mcast retry - set scheduler to RR priority 2 for gfs_controld 2006-09-08 Andrew Beekhof Fix prior merge with stable - create extra crm shell files during configure --HG-- extra : convert_revision : 182225b60328b0f8a94807e629d25bfbe968bb0e Merge with crm-stable --HG-- extra : convert_revision : 22085d5b2aedf5a400a42da85a1e1ebabf4502cb Refine logfile handling for pgsql RA Patch supplied by Serge Dubrouski --HG-- extra : convert_revision : 7ed56331433a68d6d65576c443ed07bec944ac6f 2006-09-07 David Teigland handle short or interrupted reads/writes, an snprintf instead of sprintf, strtoul instead of atoi, handle an ENOMEM no void arg in dlm_get_fd prototype was causing warnings 2006-09-06 James Parsons Support for DRAC ERA 2006-09-05 Huang Zhen pull from Andrew's repository --HG-- extra : convert_revision : b370f606b7dc7fbdb96cbb968a2cb9d045a9cb76 2006-09-04 Andrew Beekhof Split OCF return codes off into their own file --HG-- extra : convert_revision : c9faca25f214f1ddc401fa1252de870f2f1f3e66 2006-09-01 Lon Hohberger 2006-09-01 Lon Hohberger * include/resgroup.h: Add proto for rg_strerror * include/vf.h: Add proto for vf_invalidate (flushes vf cache) * src/clulib/rg_strings.c: Add rg_strerror function, define human-readable strings for rgmanager error values * src/clulib/vft.c: Add vf_invalidate (separate from vf_shutdown) * src/daemons/groups.c: Fix obvious logic error * src/daemons/main.c: Fix rg_doall() message during loss of quorum. Invalidate local VF cache and kill resource configurations on loss of quorum (#202497). Send RG_EQUORUM back to clustat/clusvcadm so that they report why they can't get information. Don't queue status checks if we've lost quorum. Add command line parameter to disable internal crash watchdog * src/utils/clustat.c, clusvcadm.c: Handle SIGPIPE, and produce useful errors if possible. 
2006-09-01 Patrick Caulfield Rename 'private' to 'privdata' so it doesn't upset C++ 2006-08-31 David Teigland convert write(2) calls to use do_write() which handles EINTR and short writes When deciding whether we need to unlink the checkpoint and resend journals for a failed master node we weren't distinguishing between the master failing (where we need to do this stuff) and the master just leaving (where we don't). tidy up a couple style things when we set a recovery event back to the FAIL_BEGIN state, make sure that we process the event once before processing any new messages. this is probably a better fix for bz 202635 than I added previously where we accept messages more liberally i.e. in X_BEGIN states. 2006-08-31 Robert Peterson This is a fix for Bugzilla Bug 203916: groupd daemon segfault and mount hang when mounting five or more GFS file systems. 2006-08-31 David Teigland - break from snprintf loop when buffer is filled - handle some odd error cases like EINTR - handle short writes 2006-08-31 Ryan O'Hara ccsd is now fixed such that it will not daemonize until the socket is ready for communication. As a result, the sleep after starting ccsd is no longer needed. Moved code which signals parent (SIGTERM), which allows the parent process to continue and exit. This signal was occurring before ccsd had the sockets ready for communication, and as a result the cman init script would sometimes fail because the ccsd would daemonize before the socket was ready. This fix will not signal the parent until ccsd is ready (socket is created and ready; before select() loop). 2006-08-31 Marek 'marx' Grac Minor changes. Bug #204057. Adding Apache resource agent and utility which parse httpd.conf. Bug #204054. Adding MySQL resource agents and utilities which will be common for other RA. Fix #203720. Do not run backup copies (ends with ~) of resource agents. 2006-08-30 Abhijith Das fix for bz 190204.
gfs2_jadd uses the gfs2meta filesystem to add journals to an existing gfs2 fs 2006-08-30 Ryan O'Hara Remove error handling for missing magma plugins. 2006-08-30 Andrew Beekhof Merge with heartbeat - Changes split out the GUI into a new package --HG-- extra : convert_revision : a08479025e32eacb757f3ccf81e729d87bd7e166 2006-08-30 Huang Zhen change hasbin_PYTHON to hasbin_SCRIPTS --HG-- extra : convert_revision : 2542fe89188540e2c446f35ec490ed01caaa8e42 change hasbin_PYTHON to hasbin_SCRIPTS --HG-- extra : convert_revision : e33357461fcaf92ad7bf242410ced5cba189c343 2006-08-28 Chris Feist Create symlinks for mount.gfs & umount.gfs. The gfs package should be installing the umount/mount.gfs links. 2006-08-26 Abhijith Das fix for bz 203167 and bz 202984. stop_fence was commented out. Now we do stop_fence before doing a cman_tool leave. 2006-08-25 Andrew Beekhof A new MySQL resource agent Based on the work of Jakub Janczak which was in turn based on the db2 script from Alan Robertson. --HG-- rename : heartbeat/db2.in => heartbeat/mysql.in extra : convert_revision : 38ebee853f2693b2c5abb9d0a0453f44bd8ce109 2006-08-24 Andrew Beekhof Use the non-CIDR netmask form because Linux wont accept the new notation Add the ability to accept either netmask notation to findif --HG-- extra : convert_revision : b91c7d5049fce99731bf93f128a18b31911b740b 2006-08-24 Patrick Caulfield initialise confchg_callback 2006-08-22 Chris Feist Don't force the owner to root (breaks rpm build). Added gfs_ondisk.h to allow builds outside of tree. 2006-08-22 David Teigland When we're in X_BEGIN state, accept "stopped" messages from other nodes. This applies to bz 202635. (There may be a better way to address this, e.g. 
forcing a new FAIL_BEGIN event to be processed before processing any messages) 2006-08-22 Andrew Beekhof PureFTP OCF agent from Rajat Upadhyaya of Novell --HG-- extra : convert_revision : f1e1764d83a6e098804078552e8f9b42bcd5ea28 2006-08-21 Chris Feist - Install the init script in the correct place. - Change includes for gfs_ondisk.h & gfs_ioctl.h. Copied gfs_ondisk.h from gfs-kernel to allow builds to succeed. 2006-08-21 David Teigland expand the number of cases where we don't tell gfs-kernel to do recovery because it won't be able to -- esp cases related to a mount in progress but not yet far enough for gfs to be able to do journal recovery 2006-08-21 Robert Peterson Get gfs_ondisk.h from local includes, not kernel includes. 2006-08-21 David Teigland - the check for us becoming the new low nodeid after the previous one failed and unlinking the ckpt wasn't adequately checking for the old low node having failed - rename low_finished_nodeid to master_nodeid and clarify some of the code using this since it was confusing and misleading 2006-08-21 Andrew Beekhof Reorganise how the crm shell files are installed --HG-- extra : convert_revision : ef990c13786dacf446a663451696b595a83dd380 2006-08-21 Lon Hohberger 2006-08-21 Lon Hohberger * src/daemons/main.c: Fix #202500 - simultaneous starts confuse rgmanager. This happened due to the fact that rgmanager was not correctly determining port listening status of other nodes on the first pass, and subsequent attempts to determine status of other nodes were not tried.
2006-08-21 Andrew Beekhof Merge in fake 2.0.2 release --HG-- extra : convert_revision : 9837dae29736bb85811c11f527ccb066a32e0480 Represent 2.0.2 as it went out using CVS --HG-- extra : convert_revision : 47df73aa08d5d100fd14320220905fc179621d70 2006-08-18 Lon Hohberger Fix 200776 - mixed up default log level constants 2006-08-18 David Teigland when the low nodeid fails, the checkpoint needs to be unlinked, otherwise creating the ckpt will fail down the road when another node mounts 2006-08-18 Lon Hohberger 2006-08-18 Lon Hohberger * include/resgroup.h: Change ordering and add magic field to rgmanager state field (warning: breaks compatibility from 08/08 CVS!) * src/clulib/ckpt_state.c, src/daemons/rg_state.c: Fix bug preventing correct operation of ckpt operation after initial boot. Get rid of debug info. * src/daemons/groups.c, main.c: Fix #202499 - shutdown while handling transitions sometimes allows services to restart (due to not locking RGs locally) * src/resources/clusterfs.sh, fs.sh, nfsclient.sh: Add proper warning messages if status check fails * src/resources/ra-api-1-modified.dtd: Allow 'migrate' option 2006-08-18 panjiam tsa plugin startup script CVS patchset: 10346 CVS date: 2006/08/18 05:10:10 --HG-- extra : convert_revision : ac39f6de23cfeb07b590849317ccf92dd8d4bd22 2006-08-17 David Teigland change debug messages related to storing/retrieving plocks to/from checkpoints to see more details about the ckpt 2006-08-17 Patrick Caulfield Fix a bug in the demo prog. no point in setting a bad example (again) Add a confchg callback to libcman, similar to the openAIS ones. this gives a race-free notification of cluster change deltas and will probably simplify client code hugely. (or it would if most of it hadn't already been written!)
2006-08-17 panjiam added testrun.sh script CVS patchset: 10344 CVS date: 2006/08/17 08:32:23 --HG-- extra : convert_revision : 24d77d05a1d3602974fa986c3a5069edc7396aba added options for TSA plugin CVS patchset: 10335 CVS date: 2006/08/17 03:15:41 --HG-- extra : convert_revision : 377ddad809201c91b4ac97efb885f4bbb6d346a9 option to disable tipc build CVS patchset: 10333 CVS date: 2006/08/17 02:57:45 --HG-- extra : convert_revision : f7201d7973375d5cb8d27a78d82e504e5b88c372 2006-08-16 alan Bug 1407: Added configure option to disable the kludge needed to fix the problem. CVS patchset: 10332 CVS date: 2006/08/16 21:55:04 --HG-- extra : convert_revision : a38ebfe81caccfc27f9efd8994d045d706bca272 2006-08-16 David Teigland after unlinking a ckpt, don't try to close it if we don't have it open, (no big problem, the close would just fail) and go back to syslogging ckpt close errors 2006-08-16 James Parsons Ignore unused args to stdin 2006-08-16 David Teigland don't barf on unknown option arg 2006-08-16 James Parsons ignored unused args from stdin 2006-08-16 davidlee A little tidying and resilience improvement. CVS patchset: 10331 CVS date: 2006/08/16 17:26:12 --HG-- extra : convert_revision : b2b7db1320cf089f5b8a1f6e8f56688bfd781e0d 2006-08-16 David Teigland change log_plock() to log_group() for packing/unpacking plocks in checkpoint 2006-08-16 davidlee Restore part of 1.46 accidentally lost at 1.49 CVS patchset: 10329 CVS date: 2006/08/16 11:40:50 --HG-- extra : convert_revision : 0ef25d089b162cc228c0c783f899e4395c072417 Ensure 'tr' usage portable across different implementations. CVS patchset: 10326 CVS date: 2006/08/16 09:20:16 --HG-- extra : convert_revision : 9940d530a4e06b913cca36708572d7da01b3943e 2006-08-16 zhenh A typo error CVS patchset: 10322 CVS date: 2006/08/16 09:04:18 --HG-- extra : convert_revision : 5010206251a0fde32bb40c3bedbaaadd94fa59d0 2006-08-16 Robert Peterson Fixed segfault in gfs_controld. 
2006-08-15 David Teigland errors opening sysfs files are normal/expected in many cases, so don't complain in syslog about it don't barf on extra args don't barf on unused args daemons that depend on groupd (fenced, dlm_controld, gfs_controld) should log an error and exit if groupd dies 2006-08-15 Patrick Caulfield "group" should be "amf" At startup, check that ALL nodes in CCS have nodeids assigned. If not, then refuse to start. 2006-08-15 lars From: Keisuke MORI Pgsql fails to launch PostgreSQL unobviously even if it's configured properly. The cause is that PostgreSQL log file /var/log/pgsql.log, which is hard-coded in the script, is not usually writable to PGDBA user (at least on RedHat EL4), and pgsql just returns with no obvious log messages when it failed to create the log file, so users will be puzzled why it failed. * Solution: This patch fixes it as follows, and also fixes some typos: - make the log file path configurable as an OCF parameter "logfile" in cib.xml. - try to create the log file with PGDBA user and leave an error log message if it failed. - default logfile path is now /dev/null. I made it so because 1) an appropriate path largely depends on distros' or a user's policy and 2) the default of PostgreSQL init.d script which comes with RedHat is also /dev/null. Acked-by: horms@verge.net.au Acked-by: lmb@suse.de CVS patchset: 10307 CVS date: 2006/08/15 10:22:02 --HG-- extra : convert_revision : f487661c8f889473db8ce465b7d732554b8e1fd9 2006-08-14 David Teigland Code that starts groups in order of level during recovery wasn't working right in the case where a node fails while mounting, i.e. node fails after it's joined the level2 mountgroup but before it joins the level1 lockspace. Code now checks that all lower levels are recovered instead of just checking that level-1 is recovered. remove a couple log_error's 2006-08-14 msoffen Fixed so that SSH and REBOOT are taken from the config settings.
CVS patchset: 10193 CVS date: 2006/08/14 20:00:55 --HG-- extra : convert_revision : f09756d70909679b6aefb82d27f3d180c38e9bc8 Added Stateful. CVS patchset: 10190 CVS date: 2006/08/14 19:56:30 --HG-- extra : convert_revision : d640e2f44896a89b191e3d5c42075e15de750c84 2006-08-14 David Teigland show all options in help output There's been a relatively unusual problem explained in the comments that I'd been putting off fixing for lack of a nice solution. Turns out this problem could crop up more often than hoped, so have had to fix it. 1) mount.gfs asks gfs_controld to join mount group 2) gfs_controld does and notifies mount.gfs to go ahead with mount(2) 3) gfs_controld gets a stop callback for the group due to another node mounting 4) gfs_controld needs to wait for the kernel mount to complete before it can stop/suspend the mount group (through sysfs) 5) mount(2) fails in the kernel for whatever reason 6) mount.gfs tells gfs_controld the kernel mount failed gfs_controld is waiting for the kernel mount to complete outside its normal poll loop, though, so it won't ever get the message in step 6, and will wait forever for the failed mount to actually complete. Added a pipe between mount.gfs and gfs_controld that mount.gfs just uses to send a failed mount message. gfs_controld watches the pipe for this error message while waiting for the kernel mount. mount.gfs uses unix socket ancillary data to send an fd to gfs_controld. 2006-08-14 andrew Indicate when we're doing a deeper monitor action CVS patchset: 10189 CVS date: 2006/08/14 16:32:53 --HG-- extra : convert_revision : df68093a736c8193d30efe8537c9381c7b7a69f9 2006-08-14 lars Remove removed file.
CVS patchset: 10184 CVS date: 2006/08/14 13:13:33 --HG-- extra : convert_revision : 0ab3250ec9adea1d52fda57a499d2711d6a4263e 2006-08-14 andrew Replacement stateful resource agent that is suitable to be used as a template CVS patchset: 10182 CVS date: 2006/08/14 12:57:29 --HG-- extra : convert_revision : 964d95a0186a10802adb93180222bb27a68a8e5f This is set for all heartbeat RAs... so it seems that it's required and it would make sense to set it centrally CVS patchset: 10180 CVS date: 2006/08/14 09:39:06 --HG-- extra : convert_revision : e4096939dcaea45f7ac6fe6e4256ed2b4907f31a Cleanup prompted by inability to add an IP in under 10s on a regular basis - Fix indenting - Remove dead code - Remove evil code - Better use of SYSTYPE switches - Add required abstraction for BSD variants - More efficient operation - Use configured values when supplied - Complain when configured values don't match the results of findif - Move all validation to validate-all and only automatically call it before start - Reduce unnecessary logging - Do not use ping by default (do so only when OCF_CHECK_LEVEL > 0) More to follow... CVS patchset: 10177 CVS date: 2006/08/14 09:33:47 --HG-- extra : convert_revision : 855d6fdd37e06b7b8925eef8078dc8ebdbb80617 Fix resource agent info CVS patchset: 10174 CVS date: 2006/08/14 09:12:46 --HG-- extra : convert_revision : daa73e3b4443e85d127cfc9876777648d19695c9 Remove the hideous OCFMSDummy agent that timed out frequently and didn't really know what it was doing. Replace it with something that works and can sanely be used as a template by others CVS patchset: 10172 CVS date: 2006/08/14 09:09:14 --HG-- extra : convert_revision : ae641a048d9eca1a288e0897bd97a55c9a439589 2006-08-14 Robert Peterson Fixed segfault converting bitmaps during inode conversion. 2006-08-12 Robert Peterson Reset other inode bits when temporarily setting S_IFDIR bit.
2006-08-11 David Teigland report mount failure debug message earlier 2006-08-11 Robert Peterson This change is for Makefile reform allowing a simple "make" command to recompile the entire cluster suite: 1. All Makefiles have been changed to get rid of references to "copytobin" and the bin directory. This will eliminate discrepancies between "bin" versions and installed versions of the programs. 2. The cman and group configure files have been modified to allow linking properly to /usr/lib64 on x86_64 systems without specifying libdir. 3. Fixed several problems relating to "make install" not recompiling modified code. 4. Fixed some dependency problems with gfs and gfs2 tools that linked against libgfs and libgfs2 respectively. 5. All Makefiles have been updated to use "make -C " rather than "cd ; make" so that compile errors won't charge ahead without stopping you. 6. Deleted references to obsolete iddev library. 7. Got rid of more "linux" symlinks for includes. 8. Misc minor Makefile cleanup. 2006-08-11 Lon Hohberger Apply Navid's patch to -head 2006-08-11 Robert Peterson Get rid of symlink "linux" for referencing includes and use the correct lm_interface.h. 2006-08-11 Patrick Caulfield Create a pipe between cman_tool and the cman daemon so that it can communicate back any failures that occur during initialisation. This should help debug any problems people have with cman appearing to die straight after startup. Set a good example by checking return values. We don't really need to include signal.h twice :) 2006-08-10 Robert Peterson Make block_list use a consistent set of values rather than enum values in one place and #defines in another, and trying to keep them in sync. Fix include gfs_ondisk.h to be located in gfs kernel source rather than in the kernel includes. Fix minor compile problem due to missing include. 
2006-08-10 David Teigland log_debug() when we receive a withdraw message 2006-08-10 Robert Peterson Mounting was mistakenly allowed with too few journals. 2006-08-09 Lon Hohberger Fix relocation & transition handling 2006-08-09 Robert Peterson Fix compile error with vmalloc. Hex values were not shown or printed correctly on x86_64. 2006-08-09 David Teigland don't send plock debugging to stdout with -D, use -P to get that now 2006-08-09 James Parsons addresses bz193065 Removed BULL refs from man page 2006-08-09 msoffen Added oracle, oralsnr, and pgsql. CVS patchset: 10160 CVS date: 2006/08/09 13:55:51 --HG-- extra : convert_revision : 0f3b3fbe0a6c52420993d88c5fa4aa2fafbff779 2006-08-09 lars Resource Scripts: Make sure that SYSTYPE is quoted This is probably a bit paranoid, but $SYSTYPE really ought to be quoted when used in if-then-else clauses. I'm not sure about case, but there was one quoted instance, so I made them all quoted. I can make the reverse change if that is preferred. Signed-Off-By: Simon Horman Signed-Off-By: lmb@suse.de CVS patchset: 10158 CVS date: 2006/08/09 13:01:54 --HG-- extra : convert_revision : 7cd58a2d3b34689e9a3ad997004baf91ce41a43f Resource Scripts: Set SYSTYPE early in IPsrcaddr SYSTYPE is actually used in a few places, but is set inside of ip_status(). I couldn't convince myself that it broke anything, but there is clearly scope if ip_status() isn't called early enough now or in the future. This patch sets SYSTYPE right near the top of the script. Signed-off-by: lmb@suse.de Signed-Off-By: Simon Horman CVS patchset: 10157 CVS date: 2006/08/09 13:00:19 --HG-- extra : convert_revision : 742140ae20e5dcdc2a68988f80d5cd0c6b4fd348 IPAddr2: Make sure SYSTYPE is defined as it's used This is a fix for a problem reported by Benoit Donneaux. I have no idea when it was introduced, but it seems kind of bad.
http://permalink.gmane.org/gmane.linux.highavailability.devel/1242 http://www.osdl.org/developer_bugzilla/show_bug.cgi?id=1397 Signed-Off-By: lmb@suse.de Signed-Off-By: Simon Horman CVS patchset: 10156 CVS date: 2006/08/09 12:58:14 --HG-- extra : convert_revision : 618d24d4a881abae5d3c1ff124c9179537373b99 2006-08-09 Patrick Caulfield Some systems need #include and who are we to deny them ? 2006-08-08 David Teigland The idea to have the last node that did the checkpoint try to reuse it even if it wasn't the low nodeid any more doesn't work because the new mounter tries to read the ckpt when it gets the journals message from the low nodeid before the ckpt is written from the other node. Now, the low nodeid is always the one to create a ckpt for a new mounter which means a node saving the last ckpt needs to unlink it when it sees a new low nodeid join the group. 2006-08-08 Benjamin Marzinski setting multiple locations for gnbd_get_uid to check for scsi_id, and updating the man page. Patches from Fabio pull devfs stuff out of gnbd. 2006-08-08 David Teigland if a node has a saved ckpt when it unmounts, it needs to unlink it so another node can create a new ckpt for the next mounter use the correct (global) handle when unlinking a checkpoint 2006-08-08 Lon Hohberger Fix parameter ordering for calling cman_send_data_unlocked 2006-08-08 alan Minor tweak to make configure happy CVS patchset: 10150 CVS date: 2006/08/08 12:41:54 --HG-- extra : convert_revision : 734d65e07c1df8e11d1f9ff2ceaa09540f04d66d 2006-08-08 Lon Hohberger * src/clulib/ckpt_state.c: Preliminary implementation of replacement for VF using AIS CKPT B.02.01 (w/ built-in test program) * include/cman-private.h: Clean up APIs (cman APIs return cman_handle_t, which is void*, should be using void ** all over) * include/message.h: Bump context count to 128, add destination node ID in header of packets. 
* src/clulib/alloc.c: If we alloc the same size, return the same block * src/clulib/cman.c: API cleanups * src/clulib/message.c: Add error checking to msg_print * src/clulib/msg_cluster.c: Check destination in header before processing message remove dup #define for MAX_CONTEXTS, add proto_error() macro for displaying protocol errors. Use 'max' instead of 'fd' for select(). Use correct var when assigning contexts. Fix CMAN handles. Return correct size from msg_send() requests. * src/clulib/msgtest.c: Fix CMAN handles * src/clulib/vft.c: Don't handle VF_CURRENT inside comms thread * src/daemons/main.c: Check to see if nodes are listening on our port before we consider them running. Handle VF_CURRENT requests from other nodes. Fail if we can't determine local node ID * src/daemons/rg_forward.c: Give 10 minutes for responses to forwarded requests. * src/daemons/rg_state.c: Shorten RG state names. Fix 'Uncertain' output line. * src/utils/clustat.c: Fix ccs_member_list() function. 2006-08-07 alan Tried to fix problem with lynx not really being found... CVS patchset: 10148 CVS date: 2006/08/07 20:42:42 --HG-- extra : convert_revision : 0f979c4c1c67e4dc3d8ca517ce03a2e4cfd7aa98 2006-08-07 msoffen Fixed to start getting ldirectord to work on FreeBSD CVS patchset: 10145 CVS date: 2006/08/07 18:43:40 --HG-- extra : convert_revision : 13b10eba0ea3cbb5a0d8b96e981e2e9865032543 2006-08-07 David Teigland free all plock state for an fs when it's unmounted update lm_interface.h from version in git tree 2006-08-04 David Teigland Some basic stuff that I hadn't realized I'd not done back when first writing this: - purge plocks of failed nodes - implement get - write results back to processes waiting in the kernel bring lm_interface.h in sync with the version in gfs2 2006-08-04 Abhijith Das Continuing work on bz 195591. awk matching string for gfs and gfs2 was not right. 
Was causing the init scripts to go into a loop when both gfs and gfs2 fs were mounted 2006-08-04 Chris Feist Reverted changes to fix 64 bit arch building. 2006-08-03 lars First changelog for 2.0.7. Update configure version to 2.0.7. CVS patchset: 10128 CVS date: 2006/08/03 10:18:51 --HG-- extra : convert_revision : 42d51d6426eb836b18e32f85728757f2dace010b 2006-08-03 Robert Peterson Got rid of iddev references. The gfs2 userland tools weren't compiling when cluster configure was used because the gfs2kincdir was being overridden. 2006-08-02 David Teigland - complain and ignore checkpoint sections with a bad size - do checkpoint for new nodes if low node in charge of that failed before freeing a group struct, sanity check it's not referenced in any recovery sets do byte-swapping - checkpoint usage for plocks is getting closer, basic writing/reading of plock state to/from ckpt's works, but unlinking ckpt's and clearing open ckpt's from processes that exit don't appear to be working right in openais 2006-08-02 Lon Hohberger fix 200449 - status checks wrong 2006-08-02 Robert Peterson This is a fix for bugzilla bz 164499 (Unable to mount loopback images from mounted GFS partition). The previous fix had a problem where any writes to files in the file system would cause the problem to reappear. For more details see: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=164499 Fix compilation problems on x86_64 (link against /usr/lib or /usr/lib64 depending on architecture). 2006-08-02 Patrick Caulfield if we can't get the latest config from CCS, poll it until we do. 2006-08-02 Wendy Cheng Bugzilla 199984: Increasing gt_statfs_slots tunable could significantly boost gfs "df" performance; e.g. set it to 128 from current default (64) could cut the "df" wall time in half with larger filesystem size. However, the kmalloc call within stat_gfs_async() has the possibility to fail due to increased gt_statfs_slots.
There is really no need for this array to be on a piece of contiguous memory. Switch to vmalloc(). 2006-08-01 Robert Peterson Add useradd for ais user, added instructions for gfs (1). 2006-07-31 David Teigland 'group_tool dump plocks ' can now be used to display all plocks held in the fs - use nodeid and owner when checking the owner of a plock instead of just pid - this requires the recent addition of an owner field to the struct in the lock_dlm_plock.h kernel header - add ability to dump all the plocks to a client (group_tool) to display - add new code that uses the SA CKPT service to synchronize all the plock state for the group to a new node that joins the group, this is currently disabled until it's been tested and debugged 2006-07-29 andrew Close the parameters tag in the metadata CVS patchset: 10127 CVS date: 2006/07/29 20:28:21 --HG-- extra : convert_revision : 01ec8bce8a1afe4d842bc707cb0b369b9326ce99 2006-07-28 Robert Peterson OpenAIS builds for /usr/lib64/openais on x86_64 machines. We need to link against what it uses. 2006-07-28 David Teigland Update the cman member list every time we call is_member(). When called from the fence delay loop, we're not processing callbacks so our member list won't be updated as a result of a cman callback. 2006-07-28 Robert Peterson 1. Allow SIGINT signals so that gdb can break into hung mounts. 2. Remove multiple trailing slashes for directory and mount point. 3. Accept the -f option on umount that's sent to us during shutdown. 2006-07-27 Ryan O'Hara Early version of a script to help users determine if a logical volume might be in use by another node. Useful to avoid doing a mkfs or fsck on a mounted filesystem. 2006-07-27 andrew Allow SNMP to be completely disabled by configure CVS patchset: 10125 CVS date: 2006/07/27 12:25:46 --HG-- extra : convert_revision : 45544c436c996441ee896b074bfd5f1056dd50fa Make the Dummy RA conform to at least some of the OCF standard. 
CVS patchset: 10121 CVS date: 2006/07/27 09:19:36 --HG-- extra : convert_revision : 552d63c5ac625718c5a8d681cb4b2b1d50391bd5 2006-07-25 Robert Peterson Switch was specified incorrectly for apc power switch. 2006-07-25 David Teigland 'group_tool dump fence' will dump fenced's debug buffer keep 1MB circular buffer of debug messages that can be sent to a connected client From: fabbione@ubuntu.com This one was a nasty bug that was causing several issues. For example: mount -t gfs /dev/foo /mnt -> ok mount -t gfs /dev/foo /mnt/ -> nok failing with: can't find /proc/mounts entry for directory /mnt/ (caused by read_proc_mounts in util.c when comparing with /proc/mounts that does not reference the trailing /). Other bugs are also fixed by making mo->dir consistent. mount -t gfs /dev/foo /mnt -> ok umount /mnt/ -> nok: /sbin/umount.gfs: lock_dlm_leave: gfs_controld leave error: -1 because the mo->dir is also registered in lock_dlm daemon. This was causing a severe inconsistency that was blocking mounting/umounting of other volumes/devices. 2006-07-24 Ryan O'Hara Fixed typo. "ccstool" should be "ccs_tool". 2006-07-24 David Teigland remove duplicate \n from a couple log_debug/log_error have gfs2/Makefile install/uninstall mount and umount binaries itself rather than going through copytobin and having bin/Makefile install them (plan to remove copytobin from other dirs too) 2006-07-21 Robert Peterson Moved cman_tool from /sbin to /usr/sbin 2006-07-21 Lon Hohberger Add man pages for qdisk 2006-07-21 Patrick Caulfield Update to use new openAIS totemip & totempg APIs. Needs the Openais that's probably going to be released later today ;-) 2006-07-21 panjiam /sbin/ip is not packaged in iputils CVS patchset: 10112 CVS date: 2006/07/21 08:19:54 --HG-- extra : convert_revision : 384519fd59958a2defff68aea221c62e0b7798c2 2006-07-21 Abhijith Das gfs2 doesn't allow gfs2meta and gfs2 filesystems to run in parallel. gfs2_jadd umounts gfs2 and mounts gfs2meta to do its thing. Removed test mode.
little-endian to big-endian change on disk-hash. 2006-07-20 Robert Peterson Service stop was killing daemons, which hung system at umount time. Also, service status would hang when cluster was down. 2006-07-20 David Teigland if mount.gfs is unmounting/leaving the group because the kernel mount failed, then don't wait for the kernel mount to complete before doing the leave when a kernel mount fails and we leave the mountgroup, we need to pass an error value with the unmount/leave so gfs_controld will know to not wait for the kernel mount to complete before doing the leave use cmanincdir when building gnbd uncomment bullpap and ipmilan add copytobin for rsa & rsb (we should probably add explanation for why certain agents are commented out) 2006-07-20 davidlee Pull some residual 'PROG's into line with the others, to use full pathname. ('configure.in' had set a local PATH to find them; thus found, this information needs to make its way into the Makefiles.) CVS patchset: 10109 CVS date: 2006/07/20 14:40:12 --HG-- extra : convert_revision : 682a98501242fe988dc20b81693c7abca1e20d02 2006-07-19 David Teigland needed _safe version of list_for_each_entry when moving entries 2006-07-19 Lon Hohberger Fix typo in Makefile Add preliminary live-migration support (e.g. for Xen for FC6 2006-07-19 David Teigland some trailing )'s were left out set cmanlibdir for group do distclean do distclean in group/ Use system includes instead of including from configured kernel_src. (People building the tree on old distros may need to copy some headers into /usr/include/linux/) 2006-07-19 Robert Peterson Patch from Fabio Di Nitto: Make sure to clean up *.d files and remove gfs_fsck binary in make clean. Remove iddev from configure script. 2006-07-19 Patrick Caulfield for RRP use "active" rather than "passive" on Steve's advice. 2006-07-18 Robert Peterson Rename req_lock to gfs_req_lock to avoid duplicate symbols. Add /proc/fs/gfs support back in. 2006-07-18 msoffen Fixing inet down command. 
CVS patchset: 10106 CVS date: 2006/07/18 17:06:28 --HG-- extra : convert_revision : 8457fbd6f879f2cf312da7820100f91add34213b 2006-07-18 andrew Correctly observe --disable-snmp-subagent --- configure.in | 5 ++++- 1 files changed, 4 insertions(+), 1 deletion(-) CVS patchset: 10102 CVS date: 2006/07/18 06:20:32 --HG-- extra : convert_revision : dc782f84eb0d63f5f0a056bac685406c692a5e3f 2006-07-18 Robert Peterson Accommodate changes Steve Whitehouse made to gfs2's dinode structure. 2006-07-17 Ryan O'Hara Added support for SELinux extended attribute types. 2006-07-17 David Teigland fix up debug logging 2006-07-17 Ryan O'Hara Remove extra argument from log_debug call. 2006-07-17 davidlee Allow CMD to be a chain of commands. Use an instance of this to work around a Solaris 10 OS bug. CVS patchset: 10092 CVS date: 2006/07/17 17:00:05 --HG-- extra : convert_revision : 5808980c10bea19de105d3cc2020380a9bd40455 Catch 'declaration-after-statement' compile-time non-portabilities. CVS patchset: 10089 CVS date: 2006/07/17 11:24:01 --HG-- extra : convert_revision : c37eeb0adf971e2ba69923cbb71d306874273f31 2006-07-14 David Teigland node A may get a start cb and send a started message, and node B may receive the started message before it gets its own start cb; node B shouldn't ignore the started message from A. 2006-07-14 alan Small fix from Serge Dubrouski for one annoying problem when PostgreSQL isn't installed on a box and one tries to run the script. CVS patchset: 10087 CVS date: 2006/07/14 20:38:15 --HG-- extra : convert_revision : bd349c0d6dc04fa69219639d7cae79b9a942d2fc 2006-07-14 Robert Peterson Fix divide by zero because superblock constants were not correctly set. Split read_sb into read_sb and compute_constants like libgfs2. That enables programs that do not read the superblock (like mkfs) to get the constants they need.
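Illustration for the read_sb / compute_constants entry above: a minimal Python sketch of why splitting "parse the stored fields" from "derive the constants" avoids the divide by zero. This is not the gfs or libgfs2 code. The SuperBlock fields, HEADER_SIZE, PTR_SIZE and blocks_needed() are hypothetical names invented for the example; only the read_sb/compute_constants split itself comes from the entry.

```python
import struct
from dataclasses import dataclass

HEADER_SIZE = 24   # hypothetical per-block header size, in bytes
PTR_SIZE = 8       # hypothetical size of one block pointer, in bytes


@dataclass
class SuperBlock:
    bsize: int = 0            # block size in bytes (the only stored field here)
    bsize_shift: int = 0      # derived: log2(bsize)
    ptrs_per_block: int = 0   # derived: pointers that fit in one block


def compute_constants(sb):
    """Fill in the derived fields from the stored ones."""
    if sb.bsize <= HEADER_SIZE or sb.bsize & (sb.bsize - 1):
        raise ValueError("block size must be a power of two larger than the header")
    sb.bsize_shift = sb.bsize.bit_length() - 1
    sb.ptrs_per_block = (sb.bsize - HEADER_SIZE) // PTR_SIZE
    return sb


def read_sb(raw):
    """Parse the stored fields only, then derive the rest."""
    (bsize,) = struct.unpack_from(">I", raw, 0)
    return compute_constants(SuperBlock(bsize=bsize))


def blocks_needed(sb, n_ptrs):
    """Ceiling division; fails with ZeroDivisionError if the constants were never derived."""
    return -(-n_ptrs // sb.ptrs_per_block)


# A mkfs-style caller never reads a superblock; it builds one and derives the constants.
fresh = compute_constants(SuperBlock(bsize=4096))
```

With the derivation in its own function, a tool that creates a superblock and a tool that reads one end up with the same derived values, and neither divides by a field that was left at zero.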
2006-07-14 David Teigland add option to dump debug messages from gfs_controld using 'group_tool dump gfs', 'group_tool dump' still dumps debug messages from groupd. keep a 1MB circular buffer of debug messages, they can be dumped out by running 'group_tool dump gfs' 2006-07-14 Robert Peterson This is a bug fix for bz 164499. It allows loopback-mounted files within a gfs file system. 2006-07-14 David Teigland remove duplicate line add libgfs 2006-07-14 Robert Peterson A printf to stdout was getting redirected to the daemon's socket causing the daemon to log strange errors and mount hangs. 2006-07-13 David Teigland - memset to 0 arrays of arg pointers - tighten up the splitting of strings into arg arrays - reduce the size of the arg pointer arrays since we now know the max number of args we're splitting out 2006-07-13 Robert Peterson Remove gulm, dlm and nolock from Makefile. gfs1 will now use the dlm and nolock from gfs2. The gulm locking protocol is going away. 2006-07-13 David Teigland debug print of the full uevent string from the kernel fix dump len so we don't complain fix up group_tool dump which was broken 2006-07-13 alan Put in PostgreSQL resource agent from Serge Dubrouski. CVS patchset: 10083 CVS date: 2006/07/13 16:25:32 --HG-- extra : convert_revision : 769a49da43af2cd062f7967a159b409f6c40d622 2006-07-13 David Teigland no more dlm_device module 2006-07-13 Ryan O'Hara Moved from fence/scripts directory. Move scsi_reserve init script to fencing agent directory. 2006-07-12 Ryan O'Hara fence_scsi agent should use "self" rather than try to determine node name. 2006-07-12 Stanko Kupcevic "clumon moved under Conga project" message 2006-07-12 Lon Hohberger *** empty log message *** 2006-07-12 Ryan O'Hara Added success and failure commands in start/stop. scsi_reserve start - success if we can register with a device. scsi_reserve stop - success if we can unregister with a device.
Note that we always try to create a reservation for a device in start, but we do not care about success/failure. If it fails, it is probably because a reservation already exists. 2006-07-12 Lon Hohberger Fix licensing information in resources Add missing xenvm.sh resource Fix #198406 - lack of ipv6 support in clufindhostname.c Patch for in-tree builds from Fabio M. Di Nitto. Fix missing/non-updated #includes 2006-07-12 Patrick Caulfield Don't lose the end of a lock name 2006-07-12 Lon Hohberger - Make rgmanager actually do things. - Finish port of rgmanager to CMAN messaging. - Add feature to wait for nodes to be fenced prior to handling a node-down event. - Add direct DLM lock support. - Fix local communication. - Optimize VF data distribution algorithm to use CMAN/Totem's broadcast mode; this should make rgmanager much more scalable. - Add multiplexing for CMAN communications so threads can have pseudo private channels over the one CMAN socket. - Add service->service dependencies based on service events. - Add node ID display to clustat text-mode output. 2006-07-11 Robert Peterson Changes necessary due to removal of iddev parts (replaced by libvolume_id) Reverse previous decision on locking.c. Removed unwanted sysfs groups. Removed some debug code. 2006-07-11 Ryan O'Hara Updated copyright and fixed title. 2006-07-11 Robert Peterson Re-add locking.c with its redundant gfs_mount_lockproto and gfs_unmount_lockproto routines because calling the gfs2 equivalents (and the externalizing of them) would probably not be acceptable to the upstream community. 2006-07-11 Patrick Caulfield Don't copy the agent name if it's NULL. Make the required size of the agent string clear. 2006-07-11 Robert Peterson Changes necessary due to removal of iddev parts (replaced by libvolume_id) Changes necessary due to removal of iddev parts (replaced by libvolume_id) Also incorporated libgfs for common functions. Changes necessary due to removal of iddev parts (replaced by libvolume_id).
Also incorporated libgfs for common functions. New ncurses-based gfs_edit synced from RHEL4 and STABLE branches. Change gfs to work with new locking infrastructure. Also, changes crosswritten from STABLE, and removed undesired debug code. 2006-07-11 Abhijith Das Removed reference to lock_gulm from the script. Works fine as is. Initial commit of gfs2_jadd. Doesn't work fully. Needs to be tested with GFS2 filesystem. 2006-07-11 Ryan O'Hara Initial version of the fence_scsi man page. 2006-07-11 Abhijith Das gfs2 init script. Minor changes. Works just fine 2006-07-10 alan Clarified certain parameter error conditions with better messages. Also fixed the bug where it called findif all the time. CVS patchset: 10068 CVS date: 2006/07/10 20:08:16 --HG-- extra : convert_revision : e6d27abe8352ecd564765a9fb3971b0ae1eaac74 2006-07-10 Ryan O'Hara Name of node to be fenced is passed via "nodename=" parameter. 2006-07-10 Chris Feist Added -lpthread to LDFLAGS to fix bz #198187 (Unresolved symbols w/ ldd -r) 2006-07-10 Ryan O'Hara Added "self" parameter to dispatch_fence_agent. Added "-s" option as way to pass name of current node. This is needed for SCSI persistent reservation (fence_scsi). Added "self" parameter to dispatch_fence_agent. Needed for SCSI persistent reservation (fence_scsi). Added "self" parameter as a way to pass our_name to the agent. This is needed for SCSI persistent reservation (fence_scsi). 2006-07-10 David Teigland set DESTDIR when installing openais 2006-07-10 davidlee fix bug 1364 (cont.): ensure we cover all 'if ...'
bases CVS patchset: 10066 CVS date: 2006/07/10 14:25:32 --HG-- extra : convert_revision : 42a641bad488931b1505d5360d290d83357c7997 fix bug 1364: should restrict rpm-based commands to rpm-based systems CVS patchset: 10065 CVS date: 2006/07/10 14:20:43 --HG-- extra : convert_revision : e3fe8aa9d65a16eaf2808c6152ea7cd0fc036609 2006-07-07 David Teigland dispatch_fence_agent() was prototyped and called with an extra arg that doesn't exist in the real function 2006-07-07 Chris Feist Fixed building for x86_64. Makefile fixes to assist with rpm building. 2006-07-07 David Teigland complain and ignore a cpg confchg reason we don't understand 2006-07-07 Patrick Caulfield Make sure we ${libexecdir}/lcrso - packagers need it. Make SBINDIR default to /usr/sbin so we can find aisexec 2006-07-06 Ryan O'Hara Fix stdin parameter parsing to handle 'name=value' correctly. Added extra output when verbose option is given. Fixed code to close stdin, stdout, stderr after open3 call. Added parameters for chkconfig. Script will be started for levels 3, 4, and 5. Start priority is 25. Stop priority is 75. 2006-07-04 lars The IPaddr2 agent was missing the conditional special handling for removing & restoring IP addresses on loopback, for the rare case of being used with LVS DR. But, this breaks BSC, which tries to use an IP on lo as a regular test. CVS patchset: 9998 CVS date: 2006/07/04 14:59:48 --HG-- extra : convert_revision : ff32fcf9bec2d8ec40686e69a97192088d103e68 2006-07-03 Patrick Caulfield Don't force unwanted flags on people. honour DEBUG=y Count votes correctly - by not shadowing variables, sigh. Also fail to start if the nodeid is not set. gah!
forgot to remove the /cman off the end of SBINDIR Run aisexec from SBINDIR 2006-07-03 sunjd bug1055: tweak package dependencies CVS patchset: 9979 CVS date: 2006/07/03 05:47:08 --HG-- extra : convert_revision : 226bc62c7d357632e7523d127a9d65b06159d3ea 2006-06-30 David Teigland put back old check that previous commit avoided steps to download/build/install openais and libvolume_id tarballs build against installed cman lib and header build against installed headers and libs for cman and openais to be consistent, instead of "libcman.h" - build against installed openais/cman headers and libs - if a cman node fails that isn't in the groupd cpg, don't wait on a cpg update for it gfs_controld_connect error values are < 0, not 0 2006-06-30 Patrick Caulfield Build using installed openais 2006-06-30 James Parsons remove this file in preference for the version with filetype extension, like other agents. The Makefile generates the version for the sbin dir without extension. 2006-06-29 James Parsons Added new fence_scsi agent support Added support for new fence_scsi agent Makefile and cool new scsi agent renamed to match convention. 2006-06-29 Ryan O'Hara Fix perl cmd declaration that caused sg_persist to fail. 2006-06-29 Abhijith Das modprobing lock_dlm before starting gfs_controld, removed init.d make targets for ccs and fence 2006-06-29 Ryan O'Hara Initial check-in of SCSI persistent reservation init script. Initial check-in of SCSI persistent reservation fence agent. 2006-06-29 David Teigland - extra checking and debugging when events get backlogged - prevent joins while we're still leaving and leaves while we're still joining 2006-06-28 Abhijith Das Removed ccsd and fenced init scripts. Their functionality is replaced by the cman init script Single init script to start up cluster: Covers loading of modules, starting ccsd, cman and fencing, and starting daemons. 
Replaces ccsd, cman and fenced init scripts 2006-06-28 David Teigland fix compiler warnings fix makefile From: fabbione@ubuntu.com (Fabio M. Di Nitto) Fix install From: fabbione@ubuntu.com (Fabio M. Di Nitto) 2006-06-28 Jonathan Brassow - cmirror is not ready to compile in HEAD 2006-06-28 Robert Peterson Include man pages for convert, fsck, etc., in Makefile. Clean up .d files on Make clean rather than distclean 2006-06-27 Robert Peterson Switch to libvolume_id method of determining pre-existing file systems. 2006-06-27 lars crm_mon is installed into sbindir, no longer ha_libdir. CVS patchset: 9972 CVS date: 2006/06/27 17:03:07 --HG-- extra : convert_revision : b617fdc388351a60e1aa178079d155213d296cc7 2006-06-27 Patrick Caulfield Fix build error. Thanks Lon. 2006-06-26 David Teigland posix_test_lock() args updated for 2.6.17 2006-06-26 Jonathan Brassow - filling out client side logging implementation (patches sent previously) Work remaining: 1) client (kernel) side netlink implementation 2) server implementation 2006-06-26 Benjamin Marzinski fixing dm-multipath support for GNBD libsysfs is deprecated. Stop using it. 2006-06-26 sunjd bug1055: add packages to the rpm package dependency CVS patchset: 9969 CVS date: 2006/06/26 16:18:53 --HG-- extra : convert_revision : 5007b437bf84ea13d3ced62e4fbd5e9abe2740c5 2006-06-26 panjiam only start/stop operations need root permission, bug#1230 CVS patchset: 9966 CVS date: 2006/06/26 05:46:44 --HG-- extra : convert_revision : f5f56c4d53bb4374df4d0ea5c8055b5f73e91c48 2006-06-23 Lon Hohberger Fix includes for build on ia64 Implements 'label' support for qdisk. Uses /proc/partitions for device info & scans devices for signatures. Useful in environments where a device is present but maybe numbered differently on different nodes depending on the host/SAN configuration. Also adds initialization utility which must be run before qdiskd will use a given partition. 
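Illustration for the qdisk 'label' entry above: a minimal Python sketch of locating a device by label rather than by name, by walking the names listed in /proc/partitions and checking each device for a signature. This is not the qdisk code. The on-disk layout used here (a MAGIC prefix followed by a NUL-terminated label) and the read_header callback are assumptions made for the example; only the /proc/partitions column layout (major, minor, #blocks, name) is standard.

```python
MAGIC = b"QDSK"  # hypothetical signature; the real on-disk format is not given in the entry


def parse_partitions(text):
    """Return the device names from /proc/partitions-style text."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four columns and start with the numeric major number;
        # this skips the header row and the blank line after it.
        if len(fields) == 4 and fields[0].isdigit():
            names.append(fields[3])
    return names


def find_by_label(partitions_text, label, read_header):
    """Return the /dev path whose header carries the label, or None.

    read_header(path) is a caller-supplied function returning the first
    bytes of the device; it may raise OSError for unreadable devices.
    """
    want = MAGIC + label.encode() + b"\0"
    for name in parse_partitions(partitions_text):
        path = "/dev/" + name
        try:
            header = read_header(path)
        except OSError:
            continue  # device vanished or is unreadable: keep scanning
        if header.startswith(want):
            return path
    return None
```

Because the lookup keys on the signature rather than the device name, the same label resolves correctly on nodes where the host/SAN configuration numbers the device differently.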
2006-06-23 andrew Memory rounding fixes CVS patchset: 9957 CVS date: 2006/06/23 12:31:28 --HG-- extra : convert_revision : 1623da0fae63fdca991ee6df04798971cc6eb9bc 2006-06-23 David Teigland retry cpg_join and cpg_leave if error is TRY_AGAIN 2006-06-22 David Teigland don't process new join/leave events without quorum remove debug printf now that we copy out app member list for viewing, set the member count to that total instead of cpg member list total improvements to debug messages 2006-06-22 lars On a failed mount, cleanup OCFS2. (This would happen by the "stop" which would be issued a bit later anyway, but this is slightly cleaner.) Move helper functions into a common block. CVS patchset: 9943 CVS date: 2006/06/22 18:13:53 --HG-- extra : convert_revision : e9ebfd146123ec2cdfe822383f5bb3f8be697f8e 2006-06-22 Patrick Caulfield Update to OpenAIS with patch for CPG alignment bug 2006-06-22 panjiam use ha_pseudo_resource() instead of the built-in one CVS patchset: 9920 CVS date: 2006/06/22 02:47:27 --HG-- extra : convert_revision : fa579867c943c60c40d3f5b612cba4667a0fd5a6 2006-06-21 David Teigland add standard script need to include ../make/defines.mk to get {sbindir} definition this added -Wall which I didn't notice was missing before, so this uncovered a bunch of warnings that are now fixed 2006-06-21 lars Logging changes, less noisy on monitor ops now for OCFS2. CVS patchset: 9918 CVS date: 2006/06/21 20:43:00 --HG-- extra : convert_revision : 58acab9d0221a8cb9303683ad62213bcc8e8d9cc Large overhaul of Filesystem_stop. Checkpoint commit. Addresses: - Filesystem_status was using /proc/mounts, which is unportable. Encapsulate into list_mounts() function which takes advantage of this if present, and calls mount otherwise. - Unify handling of nested mounts with the handling of the mount itself. - Never call fuser if the attempt to umount succeeded, but return success directly. - If it didn't, send signals, sleep a bit, retry. - Use umount -f for network filesystems. 
- No need to call umount or fuser differently for block device mounts or others; using the mountpoint always works. - Reindent Filesystem_stop() be at least consistent with _some_ other parts of the script. ;-) - Split out OCFS2 specific handling in stop code path. CVS patchset: 9917 CVS date: 2006/06/21 19:04:03 --HG-- extra : convert_revision : cf8dcac35e3372c0c659d22890c8899144ca133b 2006-06-21 David Teigland Don't finalize/terminate a local group leave until we see that all remaining group members have stopped. 2006-06-21 Patrick Caulfield Pull latest openAIS 2006-06-21 Jonathan Brassow - This is the beginning of the cluster mirror log rewrite. The purpose is to work with the new CMAN/OpenAIS framework. The server moves to user-space. Will post description and RFC to cluster-devel. 2006-06-21 zhenh Added some Oracle resource agents - due to Dejan Muhamedagic. CVS patchset: 9898 CVS date: 2006/06/21 00:48:56 --HG-- extra : convert_revision : e69a30d53dde7bed4e9d80ad32623324a7a332e0 2006-06-20 alan Added some Oracle resource agents - due to Dejan Muhamedagic. CVS patchset: 9897 CVS date: 2006/06/20 21:03:51 --HG-- extra : convert_revision : 520ce46962ad2312d6ee1a560137562a0b94fe9f 2006-06-20 David Teigland openlog("groupd", LOG_PID, LOG_DAEMON) for syslog entries - sort out which messages should be log_debug/log_group vs log_print/log_error - put log_print/log_error messages in syslog 2006-06-20 Robert Peterson Fixed bugs regarding acls and eattrs. Also crosswrote some fixes from gfs1 regarding eattrs. 
2006-06-20 David Teigland don't skip fencing a node unless it's both a cman member and has fully started groupd - keep cman member list updated by using cman callbacks instead of polling cman for the latest list every time we're interested - only bypass fencing of a node if it's both a cman member and in the groupd cpg (has started groupd past the point of checking for residual gfs/dlm state) Moving the cluster infrastructure to userland introduced a new problem that we didn't need to worry about before. All cluster state now exists in userland processes which can go away and then come back like new, i.e. unaware of the previous state. Typically, when the userland cluster infrastructure on a node "disappears", another node recognizes this as a node failure and recovers. There's no problem there. The problem is when the cluster infrastructure disappears on all the cluster nodes and then comes back. The infrastructure that disappeared may have abandoned control of gfs/dlm instances in the kernel. When the infrastructure comes back, it's like a brand new cluster, it knows nothing about the residual, uncontrolled instances of gfs/dlm. New nodes would use gfs/dlm in this new cluster independently of the unknown gfs/dlm users from before and there'd be immediate corruption [1]. Eventually, the infrastructure may be able to reconstruct the global state of abandoned instances of gfs/dlm when it comes back and reassert control of them, but that's not realistic any time soon. For now, the infrastructure needs to recognize nodes with residual gfs/dlm state as failed nodes that need recovery (starting with fencing). That recognition and recovery now happens as part of the startup initialization, before new instances of gfs/dlm are created [2]. 
[1] This is trivial to demonstrate: - start up a cluster on nodes A,B,C - mount gfs on nodes A,B - run 'cman_tool leave force' on A,B,C - start up the cluster again on A,B,C - mount gfs on node C - nodes A,B are now using gfs independently of node C [2] The previous example now works like this: - start up a cluster on nodes A,B,C - mount gfs on nodes A,B - run 'cman_tool leave force' on A,B,C - start up the cluster again on A,B,C i) when groupd starts on A,B, it recognizes the uncontrolled instance of gfs, kills cman locally and fences the local node [3] ii) when C runs fence_tool join, a new fence domain is started which fences nodes with an unknown state, which are A and B - mount gfs on node C [3] This self-fencing does nothing for node C which still needs to fence both A and B itself. If A fences itself before C fences it, A will be fenced twice. This self-fencing step is optional, but it can be convenient when 1) all the nodes restarting the infrastructure find residual gfs/dlm instances and 2) reboot fencing is used. The anticipated situation is one where everyone has residual state so no one can start up to fence anyone else; all are stuck. But, they all fence themselves, reboot and resolve everything. There's a different approach we could take that would be more convenient when not all cluster nodes are likely to be mounting gfs or SAN fencing is used. In this case, a node that finds residual gfs/dlm instances would remain a cman member and not fence itself. This would contribute quorum to help another node without residual state start up and fence it. The solution to this confusing situation is simple: - groupd now checks for residual gfs/dlm kernel state when it starts up and if it finds any it kills cman and exec's fence_node . - fenced can't bypass fencing of a node unless the node is both a cman member and has fully started groupd (a node may need fencing if it's joined the cluster but groupd isn't starting). 
- the same consideration in fence_manual as fenced 2006-06-20 Patrick Caulfield Make it clear that admin sockets can't receive callbacks either. 2006-06-19 Robert Peterson Fixed problems printing stuffed directories like master and jindex. Also enhanced the jump 'j' command capabilities to jump based on highlighted directory entries. Also made it remember display mode and highlighted entry when jumping from structure to structure. Converted file systems had no journals. Changes to lock protocol were not saved. Also removed some vestigial stubs from libgfs2.h. 2006-06-19 Patrick Caulfield Add missing include. 2006-06-17 Wendy Cheng Sync with base kernel data structure changes: 1. i_sem (in struct inode) is replaced by i_mutex. 2. s_old_blocksize (in struct super_block) no longer exists. Thanks to Mathieu Avila for pointing this out. 2006-06-16 Patrick Caulfield Add include so we get a prototype for syscall() 2006-06-15 David Teigland remaining withdraw bits now in recover.c Complete the code to support withdraw, not yet tested. This also switches from using dlm locks for withdraw notifications to simply using messages. The way the daemon now works allows a much simpler approach to withdraw than what we had before where we needed the dlm locks. Setting up a dlm lockspace for the daemon was also an annoyingly heavy-weight step and the dlm kernel state of the daemon made cleaning up from crashes difficult. 2006-06-15 Robert Peterson Fixed a bug where changes to the root inode are not written to disk. Also, added logging changes for bz 156009. Added some error reporting back in when checking for gfs2 file systems. 2006-06-15 Abhijith Das Edited Makefile to look more like other makefiles, dependent on objects rather than sources, etc. Removed reference to asm/page.h (for PAGE_SIZE) from util.c. Wasn't compiling on x86_64. Instead made PAGE_SIZE a #define 2006-06-15 David Teigland Significant reworking of how mounts are processed.
The previous approach couldn't deal with certain node failures that occurred while processing a new mounter. In this new approach, processing a mounter is largely independent of processing node failures. Nodes failing while processing a mounter hasn't actually been tested yet, so there are sure to be details to fix. 2006-06-15 Patrick Caulfield Build against installed headers rather than the ./configured kernel source. 2006-06-14 David Teigland for group_tool query, fill in members list from app perspective, not cpg perspective 2006-06-14 Chris Feist Fix configure script so we don't try to pass ccsincdir & ccslib dir to rgmanager. 2006-06-14 David Teigland change log_error to log_debug for non-error update some of the build steps 2006-06-14 Robert Peterson This addresses bugzilla bug #156009 - gfs fsck needs a good review of logging. I improved logging and added some "warm fuzzy" feel-good messages to let the users know it's not hung. 2006-06-14 David Teigland don't syslog non-errors 2006-06-14 Robert Peterson Remove obsolete references to unlinked_tag. 2006-06-14 lars Patch from Dejan: NFS can be mounted several times too, so allow it to run as a clone. CVS patchset: 9848 CVS date: 2006/06/14 13:48:20 --HG-- extra : convert_revision : ba6a8fa3844c19c758edff965b37366d23faf84b 2006-06-13 Lon Hohberger Include missing .c files in src/clulib; remove defunct src/daemons/members.c Patch from Fabio Massimo Di Nitto: Fix includes 2006-06-13 andrew Patch from Dejan - better defaults - improved use of "su" - option passing CVS patchset: 9833 CVS date: 2006/06/13 06:50:21 --HG-- extra : convert_revision : 79ef0e06d650e28219f99c245a4d9e76ff6f7b2b 2006-06-13 Robert Peterson Improvements to Makefile, renamed gfs2_mkfs man page to mkfs.gfs2.8 2006-06-13 Chris Feist Not necessary to specify /sbin. Install files in the correct location. 2006-06-13 Robert Peterson Fix typo in Makefile 2006-06-12 Chris Feist Use ccslibdir instead of libdir to find ccs libraries.
Assign ccslibdir to the appropriate variable in the configure script. 2006-06-12 Robert Peterson Add gfs2_fsck to Makefile 2006-06-12 Chris Feist Remove references to magmalibdir & magmaincdir. 2006-06-12 Robert Peterson Change copyright to 2006 Fixed a bug when printing stuffed directories. For example, gfs2_edit -p masterdir was not working properly. Made all block numbers show in decimal and hex. Minor bug fixes. Moved some functions to libgfs2. Got rid of dependency on libgfs. Everything is now done with libgfs2 functions. Also did some cleanup and bug fixes. Moved functions from fsck to libgfs2, added functions to make gfs2_convert not use libgfs. Added gfs2_fsck, added gfs2_convert, renamed gfs2_mkfs to mkfs.gfs2 2006-06-12 Alasdair G. Kergon test checkin test checkin 2006-06-12 David Teigland run configure in group/ remove lock_dlm/ files, moved to group/gfs_controld/ remove daemon/ files, now in group/dlm_controld remove dlm_tool/ files, not used 2006-06-12 Chris Feist Initscripts should be installed in /etc/rc.d/init.d Fixed install script to install appropriate binaries. 2006-06-12 alan Removed the error/info messages for incorrect parameters. CVS patchset: 9828 CVS date: 2006/06/12 15:31:25 --HG-- extra : convert_revision : e94baba13128996c7a1bf4c76e6b0fed40a5349c 2006-06-12 Patrick Caulfield Remove dlm32 as all the 32/64 conversions are done in the kernel now. You need the latest GIT kernel for this to work... 2006-06-10 alan Changed some errors to warnings... CVS patchset: 9818 CVS date: 2006/06/10 19:10:20 --HG-- extra : convert_revision : 4ba1c52bbac997ff96734598a7ca8c758d2ef6db Fixed a stupid error I put in, plus made some unnecessary messages go away... CVS patchset: 9817 CVS date: 2006/06/10 17:33:30 --HG-- extra : convert_revision : aea117dd4710c885d3afa4a75ff14ebd0458c542 Put in some code to detect missing directories, and also to drop a trailing "64" from libdir and libexecdir if the directory doesn't exist.
CVS patchset: 9816 CVS date: 2006/06/10 17:04:32 --HG-- extra : convert_revision : a1790a651d2b3c3483be90f87846f63a12a1e978 2006-06-10 Chris Feist - We don't use the copytobin target in libgfs2 - Added the copytobin target in convert and mkfs - Disabled copytobin target in quota 2006-06-09 Chris Feist Modified makefile to use instead of hardcoded directory. 2006-06-09 David Teigland move dlm/daemon/ to group/dlm_controld/ move gfs/lock_dlm/daemon/ to group/gfs_controld/ 2006-06-09 Chris Feist Temporarily disable quota build until functioning w/ gfs2. Updated include to properly find gfs_ondisk.h. 2006-06-09 Abhijith Das Removed usage of IFLAG_INHERITDIRECTIO and IFLAG_INHERITJDATA flags, because they were removed from linux/iflags.h 2006-06-09 alan Updated the name of netmask to cidr_netmask It still accepts the old value. CVS patchset: 9813 CVS date: 2006/06/09 19:01:32 --HG-- extra : convert_revision : 49345750d4f8df74c0e389cdd9b4651001b3a822 2006-06-09 David Teigland there is no dlm_tool any more - get NETLINK_KOBJECT_UEVENT definition from kernel's netlink.h - needed to include "netlink.h" instead of to get the most recent one - get kernel types like __be64 and __u64 (used in gfs2_ondisk.h) by including linux/types.h - have makefile -I../../gfs-kernel/src/gfs and source file include "gfs_ondisk.h" 2006-06-08 Robert Peterson Misc bug fixes. For example, it was not updating block free counts for fsck. Missed on initial check-in. 2006-06-08 msoffen Adding SysInfo and VIPArip CVS patchset: 9800 CVS date: 2006/06/08 19:16:15 --HG-- extra : convert_revision : 0445e62598c027b8c30d4fc0147386aab15f5ae0 2006-06-08 David Teigland Set up a separate cpg for sending messages (e.g. for processing mount/unmount) instead of sending them through the cpg used to represent the mount group. Since we apply cpg changes to the mount group async, that cpg won't always contain all the nodes we need to process the mount/unmount. 
A mount from one node in parallel with unmount from another often won't work without this. Also fix the makefile to include headers from the kernel_src location. 2006-06-07 andrew Split the PE into three distinct parts: - a library for processing rules - a library for determining the current cluster status - and a library for calculating the next cluster state Most tools (such as crm_mon) will only need to use the 2nd library The first library can be used by the crmd to parse its options (without needing the rest of the PE) CVS patchset: 9788 CVS date: 2006/06/07 12:46:55 --HG-- extra : convert_revision : f364430e8a053f138be39217113a08a33a682926 2006-06-06 Robert Peterson Remove obsolete parts now in libgfs2. First working version that uses libgfs2. May still have problems. Fix several bugs and add changes necessary to match libgfs2. Ability to print directory details, minor libgfs2 changes. Prep for libvolume_id, minor changes due to libgfs2 changes. Several updates and bug fixes, mainly for gfs2_fsck. 2006-06-06 Patrick Caulfield We need to build libccs -fPIC because it's included by cman which is a shared object. 2006-06-05 Robert Peterson Added refs to libgfs2, gfs2_convert, and gfs2_fsck. 2006-06-02 Benjamin Marzinski There. that should work better changed from MODULE_PARM to module_param 2006-06-02 Lon Hohberger Start of ripping out dependency on magma; builds but does not currently work 2006-06-02 andrew Small logging patch from Dejan CVS patchset: 9757 CVS date: 2006/06/02 06:33:16 --HG-- extra : convert_revision : 942cc46ebe81caad02d92d2187db9e24cd87300b 2006-06-02 Benjamin Marzinski more Makefile fixing updated man pages with UID information 2006-06-01 andrew Missing RA CVS patchset: 9755 CVS date: 2006/06/01 16:25:02 --HG-- extra : convert_revision : afef454b83768731fccb7447ffa8da2828781f14 2006-06-01 lars The then which is double is not the true then. Ommm. 
CVS patchset: 9749 CVS date: 2006/06/01 13:40:30 --HG-- extra : convert_revision : 9485121878a17af4d03bde27cd0b9a581819c127 2006-06-01 andrew Use the automatic section name CVS patchset: 9748 CVS date: 2006/06/01 12:42:23 --HG-- extra : convert_revision : dd3464fb3a53bf3206ee15204d0017ccfc05f538 Fix for multi-word values CVS patchset: 9747 CVS date: 2006/06/01 12:29:00 --HG-- extra : convert_revision : e1439a7d740ff991a2d85cc38a432f924ebd3c83 New RA for populating node attributes CVS patchset: 9741 CVS date: 2006/06/01 08:44:40 --HG-- extra : convert_revision : 1fe57da9e216cf8871edd8e1fd1589fa23ec5b6e 2006-06-01 lars Novell #180303: Filesystem returned 7 instead of 0 for an already unmounted filesystem. (Shell variable scoping: Both Filesystem_stop and _status used "rc" to track their return code and it was thus overwritten inside _stop :-/) CVS patchset: 9738 CVS date: 2006/05/31 22:08:10 --HG-- extra : convert_revision : f06dfa6fceeccb040cedb5f05115757927e4d3ee 2006-05-31 Chris Feist Added ccsincdir and ccslibdir to facilitate building. 2006-05-30 lars Addition to zhenh's commit from Tue, 30 May 2006 00:37:17 -0600 (MDT): If it's going to be specified on the commandline anyway, this logic is superfluous (as it never was invoked because of libdir never being empty, anyway). CVS patchset: 9725 CVS date: 2006/05/30 12:24:36 --HG-- extra : convert_revision : 5daffbf4fb218c789452ec982f743d488c371877 2006-05-30 Patrick Caulfield Use new OpenAIS 2006-05-30 zhenh avoid report error when getting metadata CVS patchset: 9716 CVS date: 2006/05/30 07:34:44 --HG-- extra : convert_revision : 6e2ae28af89a7e2a374b1ada3bafe121c6291994 rollback the patch for x86-64. we can use --libdir=/usr/lib64 to build on x86-64 platform CVS patchset: 9715 CVS date: 2006/05/30 06:37:17 --HG-- extra : convert_revision : f3b0d6c91aede60924e16eb73bf6d88ed692b273 2006-05-28 lars Handle the case where no valid OCFS2 fs is found on the device. 
CVS patchset: 9688 CVS date: 2006/05/28 20:27:22 --HG-- extra : convert_revision : 09b2104c258ee780f2ff69903cb1ee6fcdbc3104 2006-05-28 zhenh add the script for setting the weight and site of node CVS patchset: 9678 CVS date: 2006/05/28 00:45:07 --HG-- extra : convert_revision : 53fdfbaa4d5bccd67a31338ad7ea54b659f62c41 2006-05-26 David Teigland when finishing an unmount for a node, we shouldn't be checking its recovery status as if it had failed (instead of unmounting) 2006-05-26 zhenh add VIPArip (Virtual IP Address by rip2) CVS patchset: 9667 CVS date: 2006/05/26 01:32:14 --HG-- extra : convert_revision : 9badf860a7eecf72452ce83bf00dec270215bf27 1.fix the libdir for 64bits. 2.add VIPArip (Virtual IP Address by rip2) CVS patchset: 9666 CVS date: 2006/05/26 01:30:50 --HG-- extra : convert_revision : de0b743fa0a6fdda36f1db04bd541fd264d34ad4 2006-05-25 lars - In post-notify for a stop, the "start" list should be empty and ignored. - Ignore post-stop and pre-start. - Enhance debug logging. CVS patchset: 9663 CVS date: 2006/05/25 17:15:56 --HG-- extra : convert_revision : cf3287027d12b6b7bfd8b5236084ee92f38360c6 2006-05-25 David Teigland when unrecovered nodes are set to be recovered again after sequential node failures, we need to exclude nodes that are unmounting and don't need any recovery - re-enable the code that waits for the kernel mount in the finish() for a mountgroup join. should work better now with the recent lock_dlm kernel change - we were missing a clear_new() call in the case of a first mounter delaying its start_done() for a second mounter - we need to discard group messages on a node that's been added to the cpg group but not yet added (asynchronously) to the app group check that the specified mount type matches the actual fs type on disk (magic number alone no longer does this since 1&2 have same magic #) retry cpg_join() when it returns ERR_TRY_AGAIN 2006-05-25 Patrick Caulfield Don't call a parameter 'time' as it confuses some compilers.
2006-05-24 Patrick Caulfield Don't segv if the node has no "votes" property. bz#192136 Align some message structures better Do the big-endian conversion so it actually works. Add missing swab_header(). 2006-05-24 zhenh when running the configure, the build directory may be empty, so we need create the directory tools for ccdv which will be generated in next line CVS patchset: 9637 CVS date: 2006/05/24 06:48:24 --HG-- extra : convert_revision : 8686ed4e9253fbe120b840aad56fcf0cea282eb8 2006-05-23 andrew Tips for new players... 1) Dont use reserved autofoo words for your new configure options 2) Dont put two calls to AM_INIT_AUTOMAKE() in configure.in EVEN IF ONLY ONE WOULD GET CALLED. People found doing either of the above should purchase a large sword for their self defence. CVS patchset: 9636 CVS date: 2006/05/23 20:36:29 --HG-- extra : convert_revision : f22a791139cd188b925e18564cefb28fa3bd44f9 Fix for compilation on suse10.1 (No idea why lmb hasn't run into this problem as well) CVS patchset: 9635 CVS date: 2006/05/23 18:58:26 --HG-- extra : convert_revision : 1d224a9495f8624091b0ac080471d8aba7a56b6f Tweak pretty_cc to work on suse10.1 CVS patchset: 9634 CVS date: 2006/05/23 18:50:47 --HG-- extra : convert_revision : ed36f1772c1d7a55b828e9f4228289c1b7b8ad99 2006-05-23 lars Perform the mount for OCFS2 at start-time, now that the notify data is available with the actual operation too, and process pre-notifications for start (Novell #177525). CVS patchset: 9633 CVS date: 2006/05/23 16:48:56 --HG-- extra : convert_revision : 687f1e066af127b7fef5ca02ec694c8fbe3d8328 2006-05-22 Lon Hohberger Fix build on 64-bit arches 2006-05-22 andrew Use the new meta namespace for CRM generated "parameters" NOTE: Neither the Filesystem (when cloned) nor the (master/slave) drbd agent functions correctly in any released version of Heartbeat - so these changes do not break backwards compatibility. 
CVS patchset: 9621 CVS date: 2006/05/22 11:03:41 --HG-- extra : convert_revision : 09112195f0d370993ac5adc84800a09e2b919751 2006-05-22 Patrick Caulfield Add support for OpenAIS RRP. altnames in ccs should now work 2006-05-20 lars The log message was wrong, cut&paste error... CVS patchset: 9605 CVS date: 2006/05/20 11:36:12 --HG-- extra : convert_revision : 53ad18a5be0edb5ea27572494031aacd0dc55a58 Do not overwrite libdir/libexecdir if explicitly set on the commandline. CVS patchset: 9604 CVS date: 2006/05/20 11:32:57 --HG-- extra : convert_revision : b7e2262c089b0c1de6dcdf0369119217e7650ad0 2006-05-19 Patrick Caulfield Allow logging facility to be configured. 2006-05-19 Lon Hohberger Add simple scoring/disk-based quorum daemon 2006-05-19 lars I should know better than to write such unportable code. CVS patchset: 9598 CVS date: 2006/05/19 14:27:08 --HG-- extra : convert_revision : 36c3dd6756ec65a0dce3096a4724568089109ed5 Allow package version to be overridden in configure. Useful for playing pretend ;-) CVS patchset: 9597 CVS date: 2006/05/19 14:24:58 --HG-- extra : convert_revision : 1bce9b7f95d767038c50fd3e23e23362e5e76563 Errors should go to the bit bucket here. CVS patchset: 9596 CVS date: 2006/05/19 13:58:16 --HG-- extra : convert_revision : 67c797cc31289c2338b1d505917b9c392919c4dd 2006-05-19 Patrick Caulfield Fix some stats as reported by cman_tool status 2006-05-19 lars Further fixes from Jo, cleaned up by lmb. CVS patchset: 9595 CVS date: 2006/05/19 12:50:06 --HG-- extra : convert_revision : c19a8fe762a81deaab23559a5bc2a720694d160e 2006-05-18 lars Put in enhancements and fixes by Jo De Baer. CVS patchset: 9589 CVS date: 2006/05/18 15:28:35 --HG-- extra : convert_revision : 61fb37aa989e99e784d5fe1ed268edce621c425b Put in first version of OCFS2 related changes.
CVS patchset: 9588 CVS date: 2006/05/18 15:27:11 --HG-- extra : convert_revision : b183b103835cfac365d2455107225992c7c4bd6b 2006-05-17 David Teigland only print debugging info to stderr if -D was used -- this may have caused problems when not using -D since debug messages would probably go down some other fd. Also log errors to syslog. decrement the node count before printing debug info so we see the new value Changes to the way recovery events are handled when we're already in a recovery event. Multiple recovery events are now merged most of the time. Also fixes to the way groups depend on other levels for ordering of recoveries. 2006-05-17 Abhijith Das bz 191222 : removed debug printk 2006-05-17 sunjd fix metadata CVS patchset: 9572 CVS date: 2006/05/17 02:04:32 --HG-- extra : convert_revision : 48b99f8b3c4428a643b6df76998223ca2f4ebbe3 2006-05-16 Robert Peterson Compile libgfs2 first. Removed obsolete gfs2_debug tool. Add convert tool and libgfs2. 2006-05-16 Benjamin Marzinski Make GNBD work with cman. 2006-05-16 James Parsons file powernet369.mib was initially added on branch RHEL4. file fence_apc_snmp.py was initially added on branch RHEL4. file fence_apc_snmp was initially added on branch RHEL4. file README_SNMP was initially added on branch RHEL4. file Makefile was initially added on branch RHEL4. 2006-05-16 Ryan McCabe fence_apc_snmp support 2006-05-16 Patrick Caulfield Don't busy-wait if a write fails. Make it background properly. Now there's no magma, there's no point in waiting for the cluster manager to connect, because ccsd needs to be started before the cman. Missing semicolon! oh dear. Don't let port number default to junk. 2006-05-16 Abhijith Das bz191222 fix. When releasing a glock with GL_NOCACHE flag set, care was not taken to ensure that only one holder for the glock remained. This was corrupting the glock and preventing further access to the glock. FLOCKS use this GL_NOCACHE flag. See bugzilla for more information. 
This patch needs to go through a test cycle to ensure that it doesn't affect other code 2006-05-15 David Teigland when sending the results of local journal recoveries, we weren't selecting just the successful recoveries and were potentially writing off the end of the message buffer 2006-05-15 Robert Peterson Initial checkin. Incorporate libgfs2. 2006-05-15 David Teigland print out the buffer for debugging when we can't parse it 2006-05-15 Robert Peterson Initial checkin of libgfs2 and related. Remove automake-related files and redundant includes. Got rid of automake and added minor fixes. Initial checkin of libgfs. 2006-05-15 Patrick Caulfield Fix key file override. Get rid of some redundant calls Be careful about returning junk as a node address. Tidy some of the inter-file communications. 2006-05-15 sunjd bug1254: can get metadata even if httpd is not found CVS patchset: 9551 CVS date: 2006/05/15 10:07:32 --HG-- extra : convert_revision : b60cf8fcd8bf7fab2f41f9324d2178ac6ca4a819 2006-05-13 David Teigland When a node is joining a group, has been added to the cpg, but is still queued to be added to the app, and then fails, we just purge its join event for the app and shouldn't queue a recovery event for the app since it doesn't exist there yet. Handle case where none of the current members of the mountgroup can recover journals (they're readonly) and there are journals that need recovery. The next rw mounter is told to recover all journals, and after it's done it needs to unblock access to the fs on the other readonly mounters. (This was mistakenly included in last checkin, it's not been tested yet.) 2006-05-12 David Teigland Comment out code that waits for kernel mount, it's not working in some cases. - id's were being removed from a recovery event's extended list too early, so multinode recovery processing was getting stuck.
- when an extended recovery event aborts another event, the extended nodes in the rev weren't being marked stopped in the first event 2006-05-12 Patrick Caulfield Mention ccs_tool as a way of adding nodeids to a cluster.conf file. 2006-05-12 alan Upped the version number to 2.0.6 CVS patchset: 9535 CVS date: 2006/05/11 22:43:15 --HG-- extra : convert_revision : 1aacc077ab10172daa0a7cbeecc67bbf35b67b72 2006-05-11 Chris Feist Fixed fence so includes the proper directory for cman. Fixed rgmanager's clulib so it includes the proper ccs directory. 2006-05-11 David Teigland set stopped flag for rev->nodeid, not ev->nodeid 2006-05-11 alan Moved the detection and dealing with 64-bit libraries, etc. from ConfigureMe to configure This means that our .src.rpms will be usable on either 32 or 64-bit platforms. Before, they weren't. CVS patchset: 9534 CVS date: 2006/05/11 18:40:18 --HG-- extra : convert_revision : 07f6ea676d963ed196d05dae668b19616880d59d 2006-05-11 David Teigland When more than one node failed at once, creating an extended recovery event, we weren't updating both recovery sets (for both nodes) when we finished processing the event. 2006-05-11 Robert Peterson New and improved gfs2_edit tool based on ncurses. This replaces both gfs2_edit and gfs2_debug. 2006-05-11 David Teigland Fix some possible problems with overlapping recoveries, and cases where we could confuse a pending event for the recovery event in the checks that control whether recovery processing should wait. 2006-05-11 alan Updates and a bug fix for the apache resource agent. CVS patchset: 9532 CVS date: 2006/05/11 14:56:42 --HG-- extra : convert_revision : 77b24c63f4d10ad7e4dc39fcc7c7688eea3a143c 2006-05-11 Patrick Caulfield Remove redundant kernel examples. Make userland examples compile cleanly. Return all multicast addresses. Also return a list of (cman) ports being listened on. Display both in cman_tool. Make the examples compile and do something.
2006-05-11 Ryan McCabe Ignore outlet groups 2006-05-10 Patrick Caulfield A rag bag of stuff: - Allow override keyfile on cmdline - more selective help (eg cman_tool join -h) - Show if fenced in cman_tool nodes - Show node addresses in cman_tool nodes - If no multicast set then encode cluster ID in a multicast address to avoid clashes of clusters - add cman_get_node_addrs() to libcman (not fully implemented yet) - Some preparation for rrp in OpenAIS. Add facility to assign a multicast address to the cluster. This is (perhaps confusingly) part of the addnodeids subcommand which is really the V1->V2 upgrade command now. Don't go into an infinite busy loop if cman shuts down. 2006-05-09 David Teigland If there are no devices defined within a node's method, that method should be considered failed. Fix from navid@redhat.com Fixes bz #190661 2006-05-09 Abhijith Das fix for bz 190392 + pjc's return 1 if 'status' fails. bz#187279 2006-05-05 Abhijith Das bz 190200 : man page changes for gfs2_tool changes bz 190200: cleaned up gfs2_tool to compile. Removed some features (gfs2 doesn't support them anymore) and commented some out for later implementation. Needs to be reviewed sometime in the future. 2006-05-04 lars Add ha_propagate. CVS patchset: 9469 CVS date: 2006/05/04 15:59:05 --HG-- extra : convert_revision : 0fe93a8368a5dc65815d411d16d212fccd1cd3ed 2006-05-03 David Teigland When mounting a fs, we first join the mountgroup, then tell mount.gfs to proceed with the kernel mount. Once we're in the mountgroup, we can get a stop callback at any time, which requires us to block the fs by setting a sysfs file. If the kernel mount is slow, we can get a stop callback and try to set the sysfs file before the kernel mount has actually created the sysfs files for the fs. A new (untested) function delays any further processing until the sysfs files exist.
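The 2006-05-03 entry above describes delaying further processing until the sysfs files for the fs exist. A minimal sketch of that kind of wait loop follows, written in Python for brevity rather than in the daemon's own C. The function name `wait_for_path`, the timeout, and the polling interval are hypothetical choices for illustration and are not taken from gfs_controld.

```python
import os
import time


def wait_for_path(path, timeout=10.0, interval=0.1):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Returns True if the path appeared, False if the wait timed out.
    (Hypothetical helper; names and defaults are illustrative only.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would hold back the stop-callback handling until `wait_for_path()` returns True for the relevant sysfs file, and treat False as an error to log.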
2006-05-02 David Teigland Untested fix for the case where nodeA fails, nodeB fails while recovering journalA, nodeC needs to recover both journalA and journalB. nodeC wasn't retrying to recover journalA. make some names more descriptive of what they do build all order: cman/lib, ccs, cman An unused recovery set needs to be cleared in either cpg or cman callback, whichever comes second. Last checkin also wasn't adding recovery_set to list when created by cman callback. fix logic that reduces debug output This should fix the trouble caused by nodes rejoining the cluster before they've been recovered (common with fence_manual). There may still be some problem if the unrecovered/rejoined node fails again before the first recovery for it is done. 2006-05-02 Patrick Caulfield Disable VS filter 2006-05-01 alan Added cibadmin man page. CVS patchset: 9446 CVS date: 2006/05/01 05:08:23 --HG-- extra : convert_revision : 378b8da09de7cda6f9c138beddeb010d97490384 2006-04-29 alan Patch to allow the management daemon to not be started automatically if it's not been compiled in. 
This patch due to Keisuke MORI CVS patchset: 9443 CVS date: 2006/04/29 13:12:20 --HG-- extra : convert_revision : 1aeaf3afc625f9dea4d3cedc93d7343f336984c1 2006-04-29 David Teigland it's not right to use "init" to determine when to save an options message vs processing it; we have to check if we've received a start callback yet looks like we need to build: cluster/cman/lib cluster/ccs cluster/cman 2006-04-28 David Teigland %ll print format for handles in withdraw the group name wasn't being parsed from the table name when multiple nodes failed together (in one event) we were only marking the first as stopped and not the others 2006-04-28 msoffen Changed from looking for just glibtool and libtool to check for the FreeBSD port names CVS patchset: 9442 CVS date: 2006/04/28 19:01:36 --HG-- extra : convert_revision : c24b07de39aaf91f0d8ec0566f6c2a1a85956bf9 2006-04-28 Patrick Caulfield This needs libcman now too Use new OpenAIS with important CPG fix. cpg now tells you /all/ the nodes that left the cluster rather than just the first one. 2006-04-27 lars Fix up path to crm_master. CVS patchset: 9438 CVS date: 2006/04/27 19:54:37 --HG-- extra : convert_revision : 28b12a7edc970b37f31290a13a3c45904c42b84a 2006-04-27 Robert Peterson Initial checkin of libgfs2. 2006-04-27 David Teigland We can get more than one node reporting success in recovering the journal for a failed node. The first has really recovered it, the rest have found the fs clean and report success. Change assertion to recognize that. 2006-04-27 alan Put in a patch to the OCF DRBD resource agent Thanks to Florian Knauf for the fix! CVS patchset: 9437 CVS date: 2006/04/27 18:07:43 --HG-- extra : convert_revision : 53936c5477d8dc5c8a7d154fb27a2dbe1d801ae3 2006-04-27 David Teigland If a bug in a client results in them doing a startdone at the wrong time, catch and report it. When we get the extraneous recovery_done for our own journal when we mount, we need to just ignore it.
Previous changes to recovery inadvertently added recovery processing to this case. 2006-04-27 Patrick Caulfield Use latest openAIS which, I think, will fix Dave's trouble 2006-04-27 davidlee New variables: preparing for removal of multiple, replicated '-D...' across various 'Makefile.am' CVS patchset: 9436 CVS date: 2006/04/27 14:07:12 --HG-- extra : convert_revision : 6a6475821e9f438f331fcdf628409f9b9e06afa7 2006-04-27 David Teigland When a recovery event arises while processing another recovery event, the first event continues to be processed through the stopping state before it's supplanted. We weren't calling process_current_event() on it, though, so it wasn't going anywhere. 2006-04-26 David Teigland recovery result values for success/gaveup reversed munge debug messages Major changes in handling recovery for failed nodes. Previously, we assumed that if there was a node mounted rw, the journals of any failed nodes would be recovered. The way mounts are done today, the last rw mounter could be unmounting during a recovery which means it may not do recovery. To fix this, nodes now inform each other when they've successfully recovered a journal. 2006-04-25 Patrick Caulfield Inform cman when we have fenced a node. Add API to keep track of when and how a node was last fenced. 2006-04-25 James Parsons Fixed copy - paste error in usage 2006-04-24 Patrick Caulfield Admin sockets can't get notifications either. Fix (I hope) hierarchy descending in CCS. 2006-04-24 lars ... and don't overwrite it with some bogus value if it was set correctly. 
CVS patchset: 9388 CVS date: 2006/04/24 15:28:38 --HG-- extra : convert_revision : be0bf6329c9ff8f383cd5cf5988339fe7334c600 2006-04-24 andrew got the condition reversed - thanks lmb CVS patchset: 9387 CVS date: 2006/04/24 14:57:00 --HG-- extra : convert_revision : 01419f04bb77bad6012bb4e0e63e927bd1f2ca62 Set OCF_CHECK_LEVEL correctly CVS patchset: 9386 CVS date: 2006/04/24 14:46:15 --HG-- extra : convert_revision : 6f68b2f4e8328db5916ea38cf8e6398069a5a17c 2006-04-24 alan Updated version number... CVS patchset: 9373 CVS date: 2006/04/23 23:59:17 --HG-- extra : convert_revision : 0f628709467fd86b2d4f87cf55360195d110b1b2 2006-04-23 andrew new RA CVS patchset: 9363 CVS date: 2006/04/23 09:47:53 --HG-- extra : convert_revision : 521118751fc431afc17a478c0892592e2a206010 2006-04-21 David Teigland debug statements improve some debug messages 2006-04-21 Ryan O'Hara Fixed compiler warnings. 2006-04-21 David Teigland now need to build cman before ccs 2006-04-21 Patrick Caulfield Change the default port. If people try to run old and new cman on the same port on the same network, things could get /very/ messy. Use new AIS logging functions. 2006-04-20 Robert Peterson Fix for bugzilla bz 179069: gfs_fsck unable to fix file system. This is an extensive fix to repair damaged and corrupt resource groups and resource group index entries that previously caused gfs_fsck to abort. 2006-04-20 Ryan O'Hara Removed magma dependencies. 2006-04-20 davidlee 'findif' invocation needs base interface (e.g. 'le0'), not new (e.g. 'le0:1') CVS patchset: 9339 CVS date: 2006/04/20 16:33:23 --HG-- extra : convert_revision : 3dcef1e77041a7a510330e7d001e2e0d925e0de6 2006-04-20 Patrick Caulfield Don't call lcr_component_register twice as it /really/ annoys aisexec. Also cope with being partially invoked (ie added to openais.conf but not called with OPENAIS_DEFAULT_CONFIG_IFACE set. 2006-04-19 David Teigland Look for gfs_ondisk.h in ../../gfs-kernel/src/gfs/ instead of in the kernel src tree. 
gfs2_ondisk.h will be in the kernel tree so we can continue to include 2006-04-19 Ryan McCabe renamed to rsb rsb agent, this time named rsb along with a man page and makefile mods 2006-04-19 David Teigland Fix hang if unmount overlaps recovery slightly differently, i.e. if unmount occurs between telling gfs to do recovery on a journal and getting a response. (Last fix was for unmount just before telling gfs to recover a journal.) This should help fix a possible hang caused by a node failing at just the right time during our unmount. 2006-04-19 davidlee Tidy a couple of config-time directory variables CVS patchset: 9319 CVS date: 2006/04/19 15:49:19 --HG-- extra : convert_revision : ea8d91c00632e9f0560a7decad726238d8cb96ec 2006-04-19 andrew Use CTS for checking basic sanity of the CRM and remove the bloody awful scripts that used to do the job CVS patchset: 9307 CVS date: 2006/04/19 12:22:08 --HG-- extra : convert_revision : c0d899055471801424f1422935a4785de45ab8a4 2006-04-18 alan Fixed another little bug in apache related to handling missing configuration files if they're on a disk which isn't mounted. CVS patchset: 9298 CVS date: 2006/04/18 20:24:01 --HG-- extra : convert_revision : 6a1389adda1690c6a237713f57889dfca2dbbe64 2006-04-18 David Teigland Remove the restrictions on when readonly mounts are allowed. As long as the fs doesn't need recovery, any mount mode (rw/ro/spectator) is allowed. byte-swap id's in inter-node messages 2006-04-18 Ryan McCabe logout correctly on status check 2006-04-18 Patrick Caulfield Use revised ais loading system (that now works correctly) on FC5 2006-04-17 David Teigland If the last rw mounter umounts leaving only ro nodes, the remaining nodes don't need to leave the fs blocked like they do if the last rw mounter fails (and needs journal recovery).
This also means that an rw mounter that mounts when there are only ro mounters doesn't need to do first-mounter recovery except when there are actually journals that need to be recovered. 2006-04-17 Ryan McCabe fence agent for fujitsu-siemens primergy rsb device 2006-04-17 David Teigland Add some more tips, and an example cluster.conf 2006-04-16 alan Fixed several minor problems with return codes Also changed the order things were validated so that checks for some "normal" errors come last. CVS patchset: 9285 CVS date: 2006/04/16 06:43:39 --HG-- extra : convert_revision : b2bd294ce5c125cffd273daeab52a0879212f8e3 2006-04-14 David Teigland A start_done() was missing in the case where the last rw mounter leaves and the remaining ro/spect nodes are restarting. 2006-04-13 David Teigland valid jid is >=0 not >0 2006-04-13 Stanko Kupcevic FC5 support, start cs-deploy-tool with --fc5 argument 2006-04-13 David Teigland Add remount support which involves updating our mount mode (ro/rw) on all nodes. Fix 16/32-bit mixup in metaheader that was recently fixed in gfs2. This is a significant rewrite and expansion of the code that manages gfs mounts. For the first time we are considering the mount mode (rw/ro/spectator) in the mount process and how it relates to recovery. In the past, we've not dealt with the issues that come up when ro/spectator nodes cannot recover the fs. This has only been tested up to the point of the previous functionality, there will still be bugs in the new areas. 2006-04-13 Chris Feist Removed most references to magma from gnbd. Removed gulm from Makefile. Removed dlm-kernel & gulm from configure script. 2006-04-13 Patrick Caulfield 'addnodeids' command adds node IDs to nodes that don't have them. 2006-04-13 andrew Apparently sles10 uses GZIP as a variable to store the compression level in and rpminst uses that variable... 
so we need to use something else so "make rpm" works CVS patchset: 9273 CVS date: 2006/04/13 09:54:19 --HG-- extra : convert_revision : 0c874bc2a1116fddaec710f709bbb66fa48b07aa 2006-04-13 Patrick Caulfield Allow users to specify log level on command line Add recognition of ALTNAME tags in CCS to specify multiple ethernet interfaces to use for communications. Note: The lower ais layers don't fully support this yet... 2006-04-12 Chris Feist Removed gulm directory as it is no longer used. Removed gulm from HEAD. Removed directories not needed for RHEL5. Updated Makefile and configure script to reflect directory removals. 2006-04-12 Patrick Caulfield s/Blocked/blocked/ when displaying quorum. 2006-04-11 David Teigland mount configfs at /sys/kernel/config, not /config mount point for configfs is /sys/kernel/config, not /config 2006-04-11 Patrick Caulfield Explain cna_address 2006-04-11 panjiam cmpi providers will be installed to the path specified in the openwbem config file if --with-provider-dir is not present. if config not found, then install to the path with the same prefix as the cimom's binary (/usr or /usr/local). CVS patchset: 9236 CVS date: 2006/04/11 07:20:47 --HG-- extra : convert_revision : 2c1cbde22e690dd1d539daaa4cc161976d22d24a 2006-04-10 Abhijith Das fixed. Obeys LSB standards for return values 2006-04-10 andrew RA for driving pingd. Probably broken. CVS patchset: 9225 CVS date: 2006/04/10 14:54:46 --HG-- extra : convert_revision : c64184095c9645a7f1adc8ec4f544351d598ed07 2006-04-10 Patrick Caulfield Print a space between IP addresses 2006-04-10 sunjd bug 1152: replace the hardcode pathname CVS patchset: 9198 CVS date: 2006/04/10 09:10:06 --HG-- extra : convert_revision : cf5bbcf87a4926df530762e37bedb5a9675702e8 2006-04-07 Ryan McCabe support switches with greater than 9 outlets, and handle lists that run longer than one screen. 2006-04-07 Patrick Caulfield Do recursive queries on CCS and store in the objdb. (not as easy as it might sound!)
2006-04-06 Patrick Caulfield Remove a (now) spurious include Update to latest AIS. we now don't need to patch the AIS sources as it can be all configured dynamically. 2006-04-05 Stanko Kupcevic GUI touch up 2006-04-05 lars Default certain programs searched for during configure time in case they are not found then, instead of silently substituting empty strings in resource agents etc. CVS patchset: 9100 CVS date: 2006/04/05 13:30:53 --HG-- extra : convert_revision : 62bbd9897a77b75aea5289897e6d990303cf2ccf 2006-04-04 andrew Only compile dtd validation if we have libxml2 CVS patchset: 9075 CVS date: 2006/04/04 13:20:54 --HG-- extra : convert_revision : 2e57582d208cef09b2b9857cb830bb1b13c9811c 2006-03-30 Steven Whitehouse Fix a bug where the wrong size endian conversion was used to initialize certain fields. 2006-03-29 xunsun converted HB WinPopup to ocf wrapper, fixed return code of ocf WinPopup CVS patchset: 9009 CVS date: 2006/03/29 15:48:55 --HG-- extra : convert_revision : 96ddfb35f9c7d3370974a198d5b4353131bdbaa7 2006-03-28 Patrick Caulfield Update to latest openAIS code. 2006-03-28 Benjamin Marzinski If it took too long for an uncached gnbd recvd process to stop after the server was fenced, gnbd_monitor would mistakenly think that the old dying process was a new starting process. Because of this, the client would not auto-restart the gnbd device when the server started back up. This is now fixed. Now, gnbd_monitor will not move a device from the reset state to the restartable state unless the old recvd process has stopped.
2006-03-27 Stanko Kupcevic Improved rpm selection and installation phase 2006-03-27 andrew Update CVS ignore files CVS patchset: 8961 CVS date: 2006/03/27 05:45:29 --HG-- extra : convert_revision : 8f1553660ffb705e03401c1c6bb2728a5f12303a Get the management gui compiling and running under OSX CVS patchset: 8959 CVS date: 2006/03/27 05:42:32 --HG-- extra : convert_revision : 610f615f97a09c6e96034106c90d1a672548de4c 2006-03-25 panjiam new master-slave dummy RA CVS patchset: 8943 CVS date: 2006/03/25 16:38:26 --HG-- extra : convert_revision : 669b15e1692f86e8b3de6f068df631eed0fe7da7 2006-03-24 Robert Peterson Fix for bz 186125: gfs_fsck on GFS 6.1TB filesystem gives error and leaves volume in an unmountable state. 2006-03-24 Abhijith Das cman init script fix for bz 159783. dlm module is modprobed immediately after cman module. Previously, dlm module was loaded after cman_tool join. 2006-03-24 David Teigland If the user specifies hostdata options on the command line, they need to be combined with the standard hostdata options from gfs_controld. 2006-03-23 David Teigland When we exit because the cluster has shutdown, do a force release of the abandoned libdlm lockspace to clear it up. This requires a dlm-kernel change to use force=3 for libdlm's FORCEFREE. When starting up, also clear any old configfs dirs out of /config/dlm/cluster/spaces/. Some may be left over for a defunct lockspace if the cluster shut down before the ls was left. Exit if we get POLLHUP from cman. Exit if we get POLLHUP on cman fd. 2006-03-23 Patrick Caulfield Loads more documentation for each call. Return EBUSY if we don't know whether a remote node "is_listening" or not. Make things a little clearer. 
2006-03-23 panjiam updated for 2 newly added dummy RAs CVS patchset: 8941 CVS date: 2006/03/23 02:36:59 --HG-- extra : convert_revision : d38d1499e1b8c0f7a2f815b07aad00250f90c451 2006-03-21 Jonathan Brassow - ccs_tool can seg fault on upgrade bug 186121 apcFence { port = 6.2 switch = 1 option = "off" } When ccs_tool comes across the integers, it segfaults because it expects them to be strings. This checkin fixes that. 2006-03-21 David Teigland always reply yes to a cman shutdown, letting the daemons themselves determine when we can leave don't let the gfs_controld lockspace prevent a cman leave 2006-03-21 Patrick Caulfield More documentation on API calls. 2006-03-20 David Teigland Daemons now exit when cman says the cluster is down. 2006-03-20 Patrick Caulfield Fix typo of the year: "quorumdev_poo" Don't get config variables from , just use . Having "config" as a key in a config file is just daft, also confusing and a change from the RHEL4 syntax (not that this version has the same config variables, but there's no point in making it gratuitously different). "make install" now includes AIS headers & libraries. Use the cluster name as the AIS key if no keyfile is specified. This has a couple of advantages: - It allows isolation of clusters by name (as we did before, but /really/ by name this time, rather than a hash of the name). - It avoids the problem of aisexec crashing if it was not running with a key and it encounters a packet that was. Tidy help message, fix cluster name override, tidy "status" code. clvmd no longer needs patching for GFS2 2006-03-18 lars Add OCF RA for Xen guests. CVS patchset: 8907 CVS date: 2006/03/18 00:54:54 --HG-- extra : convert_revision : 7d0eaf0e994405d17fd7d3a222d5cbfa25f7fddd 2006-03-18 David Teigland Reply to cman's TRY_SHUTDOWN callback. 2006-03-17 Benjamin Marzinski Fixing the get_uid code to make it easier to integrate with multipath 2006-03-17 David Teigland Respond to cman's TRY_SHUTDOWN callback.
add \n to error message 2006-03-17 andrew Allow the new script to be built CVS patchset: 8904 CVS date: 2006/03/17 18:01:44 --HG-- extra : convert_revision : ce724d93bbcda9f5c792fc79d17fcfc7c4dbefe9 2006-03-17 Patrick Caulfield libcman doesn't necessarily return a padded sockaddr_storage. But it /does/ return the length. We must register for events. OK, OK, I give in. Fix includes Try and keep ABI stable between versions. sizeof(cman_node_t) should match that in RHEL4 Tidy up. and remove some unused structs & constants 2006-03-17 Benjamin Marzinski fixed gnbd so that it compiles with the upstream kernel. 2006-03-17 David Teigland - If, when mounting, we receive the nodeid/jid message before processing our first start, save the message to process after. - Release withdraw locks for remaining nodes when we get the terminate callback for our leave instead of immediately when we initiate the leave. Otherwise it can conflict with the releasing we do as we process other nodes' unmounts that might be processed before our own. 2006-03-15 Patrick Caulfield libcman changes for last one. Need to fix my CVS repository. Two-phase cman shutdown. Calling cman_shutdown() will send a REASON_TRY_SHUTDOWN event to all clients registered for notifications. They should respond by calling cman_replyto_shutdown() to indicate whether they will allow cman to close down or not. If cman gets 1 "no" (0) or the request times out (default 5 seconds) then shutdown will be cancelled and cman_shutdown() will return -1 with errno == EBUSY. 2006-03-15 Benjamin Marzinski This is the gnbd code that is required for getting multipathing to work on top of GNBD. 2006-03-13 Patrick Caulfield Don't send extra state change message.
2006-03-11 xunsun removed the useless function CVS patchset: 8855 CVS date: 2006/03/11 04:02:58 --HG-- extra : convert_revision : 8c4503ffbf12d522a5a8247a4f6a367706870ef1 2006-03-10 David Teigland sort listed groups by level sort members listed for each group by nodeid When nodes are removed from the cluster, remove their dir in configfs. DLM changed to close lowcomms when this happens (it was wrongly closing lowcomms connection when node was removed from lockspace.) Add debug output showing the changes we get from cman. 2006-03-10 zhenh add a header file path so we can compile IPv6addr.c CVS patchset: 8849 CVS date: 2006/03/10 09:33:30 --HG-- extra : convert_revision : c0b7aecbf4ebf29547edbe0d0c7a3249a22a9652 2006-03-10 xunsun backed out the former patch, renamed run() to ocf_run() CVS patchset: 8848 CVS date: 2006/03/10 04:14:20 --HG-- extra : convert_revision : b5866c63972beae40a41d789c1f417ce444496cd 2006-03-09 David Teigland lock_dlmd now called gfs_controld changing lock_dlmd to gfs_controld in various places Rename the daemon binary from "lock_dlmd" to "gfs_controld". gfs_controld interacts with the cluster infrastructure on behalf of gfs file systems and controls/drives recovery of gfs in the kernel. The old name was misleading, it looked related to the dlm instead of gfs. The old name came from the fact that gfs_controld goes through the lock_dlm module when controlling gfs. Enable withdraw functions by default since libdlm locking is now working fine.
2006-03-09 xunsun removed checking for openssl headers CVS patchset: 8845 CVS date: 2006/03/09 13:42:04 --HG-- extra : convert_revision : 0329ce82bb8bbf26b1c4d402aa834c9f3070a1c0 *moved run() function to ocf-shellfuncs *added logic to support RA states OCF_RUNNING_MASTER and OCF_FAILED_MASTER CVS patchset: 8844 CVS date: 2006/03/09 12:55:10 --HG-- extra : convert_revision : ed2830ca509a522386003f114e1574282076bcae put in code to respect the broadcast parameter CVS patchset: 8843 CVS date: 2006/03/09 11:29:10 --HG-- extra : convert_revision : a165169f396bdb3dab28277c29f4e1ac81a5349e 2006-03-08 andrew Bug 197 - Checksum the CIB on disk Detect changes made by admins - particularly while Heartbeat is running CVS patchset: 8829 CVS date: 2006/03/08 22:24:29 --HG-- extra : convert_revision : 72e99ca9b2f0c06ee8b43889369955218e03a633 2006-03-08 David Teigland - don't process new non-recovery events while recoveries are still pending (recovery_set's exist) - track progress of recovery sets and free them when they are complete - before the start stage in recovery, make sure that we've received a cman callback for all the failed nodes so we're certain to have the correct quorum status 2006-03-08 Benjamin Marzinski Reorganize this directory to match the rest of the cluster tree. 2006-03-08 Patrick Caulfield Don't return an error if we had to create the control device! Return something slightly better than EEXIST if we can't create it. Odd things could happen if the lock space already existed but the device didn't. No longer. 2006-03-08 Benjamin Marzinski initial commit of csnap kernel code with a useful build structure. The code itself is exactly as Daniel left it. patches 00001 and 00002 are required to build the module. 2006-03-07 Patrick Caulfield Use latest AIS with CPG integrated into it. Fix "make uninstall" & "make clean" 2006-03-06 David Teigland Purge messages that get queued after we're added to the cpg group but before our own join is processed.
Otherwise, these messages will just hang around forever. 2006-03-03 David Teigland Because cpg leaves are processed asynchronously, we can't use the cpg being changed to send messages; the actual cpg membership may not reflect the nodes we need to send/recv messages with. Stopped and started messages sent during async confchg processing now go through a separate cpg that the groupd daemon joins itself and connects it to all other groupd daemons. Overlapping leave events now work (like you get if multiple nodes run fence_tool leave at about the same time), or leaves that occur while the group is processing a join. 2006-03-03 Patrick Caulfield Patch up so they compile. (mainly taking out query calls) 2006-03-03 davidlee 'configure.in' and 'lib/mgmt/Makefile.am' (building on 'acinclude.m4'): more reliable setting of python include dependencies CVS patchset: 8818 CVS date: 2006/03/03 12:42:59 --HG-- extra : convert_revision : 2ea91de08ca210e87e13c0634efbabce685d7c61 2006-03-02 David Teigland - add version numbers to messages - byte swap messages - adjust event id in recovery events that include multiple failed nodes - Add extended event information that group_tool -v will print - All nodes now generate event id's for each confchg based on the nodeid and event type. Every message is tagged with the event id it relates to so groupd can process only messages related to the current event. Needed when confchg's back up. api changed to unsigned id 2006-03-02 Steven Whitehouse Change the .gfs2_admin dir such that its no longer linked from the root directory, but becomes a separate root in its own right. If you want to use this version of GFS2, then you need the latest kernel sources from kernel.org. Note that the change made to the metadata is such that older filesystems will continue to mount ok, so updating mkfs is optional. 2006-03-02 David Teigland Fix how we dispatch cpg callbacks; dispatch one per poll() event. 
cpg_dispatch(ALL) breaks things if there's more than one confchg since we need to process each confchg before the next, and a loop around cpg_dispatch(ONE) doesn't work because it will block when there's nothing more to dispatch. 2006-03-02 Patrick Caulfield Remove some assertions that could fail in normal circumstances. 2006-03-01 David Teigland check that the mount point exists and is a directory 2006-03-01 Patrick Caulfield Make sure p->ls is NULL when we start up. OR we could crash at shutdown. (lon: you might want to check this in other versions) Small reversion. The sequence number was pointless. change SaNameT to struct cpg_name. add a sequence number to the confchg callback. don't crash when program exits after a cpg_leave. 2006-02-28 David Teigland - command string needs to be "setid", "set_id" wasn't recognized - improve some recovery debug messages - Basic recovery works. - Need to retry sending cpg messages when RETRY status is returned. - Finish bits for doing recovery in the middle of join/leave/recovery, not tested yet. 2006-02-27 David Teigland More descriptive debug messages. Check errors in various places where they were being ignored. Add -w option to disable the withdraw feature. Force it to be set since the dlm locking it uses isn't working. set the width of the name field dynamically so state info won't go past 80 columns as often Add event state to the set of group info we provide, and have group_tool print it so we can now tell what state the group is in, e.g. adding/removing/recovering nodes. 2006-02-27 Patrick Caulfield Don't clear too much ! 2006-02-27 David Teigland missed a line which broke the compile 2006-02-27 alan Changed the version number. 
CVS patchset: 8808 CVS date: 2006/02/27 16:26:50 --HG-- extra : convert_revision : cd4ef7ed125599b4039684911d7a72632c1a709f 2006-02-27 David Teigland check for errors returned from group_dispatch() return an error from group_dispatch() if the read doesn't return the correct amount of data when a cpg connection is closed, also clear out the poll client entry for it Uncomment withdraw stuff that uses libdlm. don't need to zero out end of addresses any more 2006-02-27 Patrick Caulfield Make errno handling a little more consistent As a courtesy, zero the whole node address field. Return correct status sign for fundamental errors. Return the correct address length for a node. NOTE: This changes the protocol version number between libcman and cman, so you MUST rebuild libcman and anything statically linked to it. 2006-02-27 panjiam moved the CIB data from CM_LinuxHAv2.py.in into CIB.py.in CVS patchset: 8804 CVS date: 2006/02/27 09:19:42 --HG-- extra : convert_revision : 6e65a4c40a7f242c7160c3ea04ce253ccd539a13 2006-02-24 David Teigland Only send jid's to new mounters, not everyone. - When we leave a lockspace, remove the configfs entries for it. - When the daemon starts, remove old configfs entries under comms/ add debugging to determine if the mount syscall is stuck note to help clean up in case umount(8) doesn't call umount.gfs2 - also build libdlm - mkfs type required: -t gfs2 (and -t for mkfs.gfs2) - umount type required: -t gfs2, or umount.gfs2 won't be called - util-linux 2.13-pre6 version of umount(8) is required, older versions won't call umount.gfs2 helper munge header for group listing use "gfs" instead of "lock_dlmd" as the group type printed by group_tool don't need to include dlm kernel headers 2006-02-24 Patrick Caulfield Now compiles against upstream (-mm kernel) DLM. Update patch so it works with new build system Don't need to override ${sbindir} anymore Fix some errors in usage.
In particular, CLVMD /can/ be used with the new software - it just needs to be patched before building. Oops, slight bug in that Makefile there. This is a bit neater. The download is now always a clean OpenAIS tarball, and we patch it after download. Install and run cman from libexec Return an error if range locks are attempted. 2006-02-24 Steven Whitehouse Prevent "uninitialized block" message printing out since several blocks don't have the metadata header now. 2006-02-24 David Teigland Fixing leaves. update for new components 2006-02-23 David Teigland Fix the check for us being in the fence domain, libgroup isn't setting the "member" field at the moment, so just look through the member list. Fix leave processing, leaving node can't wait for stopped messages from remaining nodes. Make global id consistently unsigned. match debug messages with the same ones in other daemons Comment-out withdraw stuff which depends on libdlm which doesn't match upstream dlm since we removed ranges. Remove some old state stuff that doesn't exist any more 2006-02-22 David Teigland Send messages (used for distributing journal id's and plocks) through libgroup which passes them through to libcpg/openais. libcman is now only used to check our nodeid and the cluster name. clean out a few unused bits Munging to get this compiling again; need to define new annotated types used in gfs2_ondisk.h, and need our own gfs2_xyz_in() functions since gfs2 now puts these in its own ondisk.c file. 2006-02-22 Steven Whitehouse Added a simple set of compile/install instructions. 2006-02-22 Patrick Caulfield as per lon: add cman_get/set_private routines. Remove some redundant admin routines ln -sf doesn't do what I thought it did. so rm the symlink before creating the new one. Use latest openais - with improved CPG support.
2006-02-21 David Teigland remove unused file 2006-02-20 David Teigland code for passing through messages, setting global id's, and some changes to how recovery events will mix with other events bits for sending messages to the group 2006-02-20 Patrick Caulfield Get rid of some redundant stuff Don't do endian conversion on nodeids, ais does that for us now. 2006-02-20 Wendy Cheng Bugzilla 182057 - patch 3-3: Add dump_stack() into gmalloc so we could know the culprit whenever out of memory loop occurs. Bugzilla 182057 - patch 3-2: GFS was trying to split a full-grown directory (0xffff entries) hash leaf into two and subsequently hang. The buffer requirement 0xffff*sizeof(uint64_t)/2 = 262144 (256K) was too big for kmalloc to handle. Change into vmalloc if kmalloc fails. Bugzilla 182057 - patch 3-1: Fixes directory delete out of memory error. Found in customer environment where gfs_inoded is deleting a max size of hash unit (0xffff entries). It hangs in leaf_free() during gmalloc while kmallocing 0xffff*sizeof(uint64_t) (=512K) of memory. It did a kmalloc, zeroed out the buffer, then copied the zeroed contents into bh buffer and subsequently sent the bh into gfs_writei to write out to disk. This patch removes the unnecessary kmalloc plus the memory copy by directly zeroing out the bh buffer. 2006-02-17 David Teigland more progress, can now process a join misc fixes 2006-02-17 xunsun Put date in mail subject (Joachim Banzhaf ) CVS patchset: 8738 CVS date: 2006/02/17 13:33:08 --HG-- extra : convert_revision : 8b21aceef16a71a716f0e92290517dd06a87bbdb 2006-02-16 David Teigland rework things so we should be closer to passing message delivery through this lib handful of fixes 2006-02-16 Abhijith Das fix for bz178812.
ccsd init script and daemon print errors now 2006-02-16 David Teigland misc fixes getting it to work 2006-02-16 Patrick Caulfield Detach the daemon rather better 2006-02-16 David Teigland include headers from ../../cman/daemon/openais/trunk/include 2006-02-16 Patrick Caulfield Check if ccsd is running before attempting to start the daemon. Otherwise the user just waits around for nothing to happen. Allow stderr log messages from ais if "cman_tool join -d" specified. WARNING: Don't use this on a serial console with full debug messages enabled or it will repeatedly timeout ! Give aisexec longer to get started. Now that all the CCS accesses are in the daemon it can take some time to get going. Run cman as an AIS component. This allows other AIS services to be used in conjunction with it - using their normal API. This checkin includes a few small hacks to openais to make it run without a configuration file - cman will supply all the config information to it from CCS. It also includes a new AIS service: cpg. This is a closed process group service I'll be submitting upstream soon but is currently used by groupd. As there is no AIS config file read by the daemon the services used are hard coded in openais/trunk/exec/mainconfig.c as: evs, clm, cpg, ckpt, evt, lck, msg, cman 2006-02-16 xunsun make the Raid1 RA more mdadm friendly (Ranjan Gupta), and other fix CVS patchset: 8718 CVS date: 2006/02/16 08:36:31 --HG-- extra : convert_revision : b2f7a3050ff789a35c7065ded8ba1cc0e09d0203 2006-02-15 David Teigland Put this code back into the correct state. (This is not supposed to be the same as the version on RHEL4 or STABLE branches. This is new.) add nodeid to message deliver callback Remove group_join/leave arg. Need to find a new method for detecting if new mounter is spectator. Remove group_join/leave arg. Removed group_join/leave arg. Remove the "info" params to join/leave and other related functions. Add the API for sending/receiving. 
New groupd that uses the cpg (closed process group) service from openais. It still uses cman, but only to get the local nodeid and quorum values. 2006-02-15 Wendy Cheng Joined work of bugzilla 164331 (Abhijith Das) and 178469 (specsfs): While granting exclusive lock, gfs_glock_cb() expects all other threads have relinquished their writes and journal has been flushed and shutdown. Otherwise it aborts the call and forces a filesystem consistency error. The current umount code (gfs_put_super) doesn't follow this logic by doing flushes without log shutdown before the exclusive lock is requested. The patch works around this issue by relocating the flushes into gfs_make_fs_ro() call itself after the gfs_glock_nq_init() call. Properly handle error return code from verify_jhead(). 2006-02-15 Robert Peterson Fix for Bugzilla Bug 178453 – Slow memory leak in /proc/cluster/dlm_dir and /proc/cluster/dlm_locks 2006-02-15 James Parsons Addresses interface change in drac_mc firmware version 1.2 2006-02-14 Steven Whitehouse This reverts the dirent ondisk structure to be the same on disk as gfs1. mkfs is now uptodate with the git tree head.
2006-02-14 andrew A common transitioner library - Abstracted from the TE code - Provides common functions like unpacking, printing and destroying - Provides a dummy set of action handlers to test the generated transition CVS patchset: 8681 CVS date: 2006/02/14 11:32:12 --HG-- extra : convert_revision : 8342bf49fafaf8ecb862622366e9ba85f64a86a7 2006-02-11 xunsun cleanup CVS patchset: 8676 CVS date: 2006/02/11 14:31:14 --HG-- extra : convert_revision : 4d93bf11b0e42eb553d2963e17b2b345fbd67ce4 cleanup CVS patchset: 8675 CVS date: 2006/02/11 14:21:20 --HG-- extra : convert_revision : 07bb00fc6cc04b3581527efbfe9792517ea5984f add the ability to deal with sub-mounts (Toby ), and fix a typo in heartbeat Filesystem RA CVS patchset: 8674 CVS date: 2006/02/11 13:40:42 --HG-- extra : convert_revision : e6903eb24b4766a68efe74349aa839e2fccf62c7 2006-02-10 James Parsons Support for drac 4/I 2006-02-09 Wendy Cheng This patch is part of fix for bugzilla 178469 where the 6th word of the nfsd file handle doesn't get correctly byte-swapped in gfs_decode_fh(). It is normally not a problem if the dentry of the file still hanging around in server's cache. However, under heavy IO, the dentry could get re-cycled and the parent info would be used to re-do lookup. We get stale file handle error as the result. 2006-02-09 James Parsons Fix for bz168698 2006-02-08 James Parsons man page for fence_rsa 2006-02-08 Steven Whitehouse This takes mkfs out of the main build system. It should make it much easier to build it since its just a question of editing the Makefile to point at the kernel source and then, make, make install. Also the binary will be called mkfs.gfs2 and make install will stick it in /sbin so that you can call it through the mkfs generic front end like all the other filesystems do. So hopefully this will be easier to build and easier to use. 
It also incorporates the changes required to make filesystems for the new journaled file format, so be aware that only the very latest git tree of gfs2 will work with it. 2006-02-08 James Parsons fixed typos 2006-02-06 James Parsons rsa support 2006-02-03 James Parsons added explanation of new auth type switch 2006-02-02 Lon Hohberger Agent fixes for #178314 fix #179662 2006-02-02 Jonathan Brassow - Make the init script do a cman_tool leave remove on stop. Restarts still do cman_tool leave 2006-02-02 andrew Fix the default CVS patchset: 8562 CVS date: 2006/02/02 14:21:08 --HG-- extra : convert_revision : 0cdf5a1b0c1bb2e490bdd8ce488fbcd55400c0ee 2006-02-02 Benjamin Marzinski This fixes a problem from bz #173697. gfs_fsck crashed on many types of extended attribute corruptions. Now gfs_fsck correctly deals with them. 2006-02-01 alan Put in code to keep malloc from using mmap to get memory, and also to keep it from returning it to the system. CVS patchset: 8540 CVS date: 2006/02/01 05:35:24 --HG-- extra : convert_revision : 4eb668aea0f5db03a3754939c33762892a5d8fce 2006-01-31 Patrick Caulfield Make nodeid mandatory 2006-01-27 Lon Hohberger Fix 179063 - some options missing from nfsclient option handler Fix 178249 - pass 2 2006-01-26 Robert Peterson This fixes Bugzilla bz 178367. Memory leak when reading from either /proc/cluster/nodes or /proc/cluster/services. 2006-01-26 lars Some here documents weren't using the proper tokens (and confused my syntax highlighting); also, redirecting to stdout is redundant. CVS patchset: 8521 CVS date: 2006/01/26 18:00:05 --HG-- extra : convert_revision : c8d706e0361d2cdbe88f49f768b1f112a32b19f0 2006-01-26 James Parsons Added fence_bladecenter back in to install list 2006-01-24 Lon Hohberger Fix broken build Fix #178249 - debug messages from gulm.so 2006-01-24 Patrick Caulfield Update to 0.71 of openais. 
2006-01-24 Abhijith Das bz127042 fix: kill gnbd_monitor when all uncached gnbds have been removed 2006-01-23 Patrick Caulfield Show nodeid in "cman_tool status" output. Don't allow a user to release a lockspace if other users have it open. 2006-01-20 Lon Hohberger Fix #166109 - random segfault in clurgmgrd. Fix most of 177467 - clustat hang (does not fix the case where the lock manager never responds to a request). Fix bug in smb.sh associated with ccs descriptors > 255. Fix 178026 - provide lock holder on request from SM. Fix 178080 - implements work around dlm release 177934 by taking a NULL lock on lockspace acquire. Add CLK_HOLDER flag: Tells the plugin to return an allocated uint64_t; #178024 - also fixes incorrect API documentation in man page 2006-01-18 davidlee Detect availability of streams (fallback purposes only); also of Solaris 10+ socket/stream credentials. CVS patchset: 8478 CVS date: 2006/01/18 17:51:32 --HG-- extra : convert_revision : 0854897c04ac551df67cc93519bc4336a20fd136 2006-01-17 David Teigland update gfs2 description 2006-01-17 Stanko Kupcevic Remove pvs from list of available storage; don't create VG if no storage configured 2006-01-12 Jonathan Brassow - endian fixes so heterogeneous clusters can work WRT CCS - thanks to Fabio Di Nitto, and pjc 2006-01-12 xunsun remove an extra line, and better indentation CVS patchset: 8442 CVS date: 2006/01/12 16:41:12 --HG-- extra : convert_revision : e5314a7a50e2d1de3145bd165d0d8fb5692d409e 2006-01-12 Steven Whitehouse GFS2 no longer writes mh_blkno into the common metadata header for metadata blocks that it creates itself. This completes the removal of the mh_blkno field. GFS2 doesn't check it at all, so it should be backward compatible with GFS1 now. mkfs no longer writes the mh_blkno field into the common metadata header since it's not needed. Removed the bmap ioctl() call from live.c since its only purpose was to allow setting of this field. The second half of yesterday's patch.
This is the recovery part and now means that gfs2 no longer relies on the mh_blkno field of the common metadata header. All normal (i.e. not journaled) file reads/writes now go via the page cache route, even if they are to stuffed inodes. 2006-01-11 Steven Whitehouse Remember to take out my debugging printks :-) Write a list of the block numbers of the succeeding metadata blocks which are about to be committed into the log. This uses the spare space at the end of the log descriptor block, so mostly it doesn't actually require any more disk I/O to do this, and when it does, it's very little. This doesn't actually change recovery (yet) since the recovery side is still making use of the mh_blkno field in the metadata header. The intention is to use the list of block numbers instead. This has a slight speed advantage in recovery since we no longer need to read any block which is in the revoke list, since we will have already read a list of the block numbers ahead of time. Once this is done, we'll no longer need to use the mh_blkno field in the common metadata header which gives two advantages: o We don't need the bmap ioctl() since its only use is the online version of mkfs o We don't need to rewrite all the metadata headers when converting gfs1 to gfs2 I plan to use an almost identical system to deal with data blocks in the journal, with the addition of an extra field to flag whether the data block begins with the GFS magic number or not, which is required so that we can escape any data blocks which do start with the magic number. When that's done, it will greatly simplify dealing with journaled data files. 2006-01-11 Jonathan Brassow - commit Lon's patch Contains: - Fix that causes connection desc's to timeout, eliminating the problem where all desc could be used-up, even though they are not in-use.
- Increase # of connection desc from 10 -> 30 - Ignore SIGPIPE - Don't call printf from a signal handler - Don't catch SIGSEGV, allowing for core dumps - build with -ggdb - applied and compile tested 2006-01-11 Lon Hohberger Merge ccsd local socket patch from RHEL4 / STABLE 2006-01-10 alan Changed Filesystem to not log status if it's called as a monitor operation. Also noted a bug about validate-all. CVS patchset: 8415 CVS date: 2006/01/10 17:32:35 --HG-- extra : convert_revision : 172916ed2cd41922e303f8abdd2dc8691eb3d939 2006-01-10 Patrick Caulfield Add quorum device interface back in. 2006-01-10 Steven Whitehouse Found and fixed an endianness bug where we were using the wrong conversion function (one I introduced earlier I think). Some clean ups which reduce redundant copying and also reduce endianness warnings. Removed some now unused functions. 2006-01-10 Patrick Caulfield Use libcman 2006-01-09 David Teigland Copy the get/set vfs<->internal casting macros from stable branch to get rid of the piles of compiler warnings. make lock modules use modified harness interface and header from gfs printk prefix GFS instead of GFS2 in merged harness code Incorporate the lock_harness into gfs itself, just as was done for gfs2. GFS1&2 share the same lock module registration interface for now so the same modules can be used with both. Munging and renaming related to the merging of lock_harness into gfs. For the time being, we copy the gfs1 lock module bottom interface so the same lock modules can be used with both gfs1 and gfs2 (it won't be possible to load both gfs1 and gfs2 at once.) Eventually the lock modules will fork for gfs1/gfs2 and the register interface can change to the gfs2_ prefix. 2006-01-09 James Parsons Fix for bz176375 regression Fixed typing designation for var 2006-01-09 Steven Whitehouse Remove a test which is no longer needed. 2006-01-06 Patrick Caulfield file 51-udev-dlm.rules was initially added on branch RHEL4.
2006-01-04 Lon Hohberger Fix 176343 - __builtin_return_address(x) for x>0 is never guaranteed to work 2006-01-04 Patrick Caulfield udev rules file for DLM 2006-01-03 Patrick Caulfield Allow non-root users to create the default lockspace. Fix race where a lock could be unlocked before dlm_lock has completed. Put externs in the header file. 2005-12-22 davidlee Omission from previous update CVS patchset: 8354 CVS date: 2005/12/22 15:03:30 --HG-- extra : convert_revision : 725ef2c6e70bb417d03422178be907c431d45956 Rev 1.481 had switched the default of "--enable-mgmt" from "no" to "try". But configure wasn't testing for GNU/TLS, which "mgmt" requires. So it was now attempting, and failing, to build on systems lacking it. 1. Test for "gnutls/gnutls.h". 2. Script already complex; the above would have made it even more so. Simplify structure, taking advantage of "WarnMissingThing" macro. CVS patchset: 8353 CVS date: 2005/12/22 14:49:56 --HG-- extra : convert_revision : d8b09a7be3d1be69f774a3d5fee37de42242b50e Setting "ECHO*" must be near end, otherwise it interferes with the output of AC_MSG*() macros (both direct invocations, and indirect such as AC_MSG_HEADER). Tidy up uses of "echo" (ECHO*), deprecated by autoconf, to recommended forms, e.g. "AC_MSG_NOTICE()". A few related (linguistic/cosmetic) tidy-ups. CVS patchset: 8352 CVS date: 2005/12/22 12:22:39 --HG-- extra : convert_revision : f3df33ebdb5227a30e6fbb7548333c0999d07e32 2005-12-22 Steven Whitehouse Update mkfs to use the new ioctl for get/set of flags for files. 
2005-12-22 zhenh fix the check of python-devel CVS patchset: 8340 CVS date: 2005/12/22 03:25:29 --HG-- extra : convert_revision : 8ae7bcf56fe85d79106178e43b9f132e5f930cc0 2005-12-22 sunjd let configure do more check for building management module CVS patchset: 8339 CVS date: 2005/12/22 00:28:53 --HG-- extra : convert_revision : b556f9ad1e37ed3f6e6a9d867af6cbf843d892f6 2005-12-21 Steven Whitehouse Adding back the get/set flags ioctl, which is the one ioctl that I think we really need. This time it's called with the argument being a pointer to the flags to be got/set rather than through the original vectored ioctl call. Remove some unused code relating to printing the lock state. If we are to use an fs based interface to revive this feature on user request (the current code is only triggered by a bug occurring) then the code I'm removing here won't be useful anyway. Merge the lock harness into gfs since it no longer makes sense to retain it as a standalone module. Also drop most ioctls() although I've left the header file for now so that the userland utils will still build. Live operations on the fs will not be possible until the userland utils have been ported over to use the new interface(s). 2005-12-21 zhenh modify some required field and default value based on lmb's comments CVS patchset: 8311 CVS date: 2005/12/21 05:36:52 --HG-- extra : convert_revision : f211332d619ee26572d76c6e9d1a1be1148a34c6 2005-12-20 davidlee Portability of '-O' of 'test' (binary vs.
sh-builtin) CVS patchset: 8307 CVS date: 2005/12/20 18:08:24 --HG-- extra : convert_revision : 97bf1b280e3257fd3f942522b460e3611771772b 2005-12-20 zhenh add required field to parameters, add description of RA CVS patchset: 8304 CVS date: 2005/12/20 08:34:47 --HG-- extra : convert_revision : f4dd48d96ff4b7473ab457f5a20258b5965c38e6 2005-12-20 sunjd remove the redundant item CVS patchset: 8302 CVS date: 2005/12/20 08:04:21 --HG-- extra : convert_revision : 51f988de9be089388e70e1c96c67d50a2cfd3f65 2005-12-19 David Teigland Check if allocate_lockinfo() fails. typo: not &'ing DLM_LKF_PERSISTENT with the flags 2005-12-19 xunsun minor changes CVS patchset: 8297 CVS date: 2005/12/19 15:11:25 --HG-- extra : convert_revision : bbdb06fb52812c43a26ea1da821138a6a05e4e29 2005-12-19 andrew Remove as much crud from the IPaddr script as possible. Where arguments are passed to function - use them. Avoid the use of global variables in helper functions. Confirmed all functionality on Linux and Darwin/BSD. CVS patchset: 8296 CVS date: 2005/12/19 13:06:57 --HG-- extra : convert_revision : 215337b653a029d02a4462d60533492f8e05d4a1 2005-12-17 andrew The calculated netmask was broken on Linux too. CVS patchset: 8289 CVS date: 2005/12/17 09:36:34 --HG-- extra : convert_revision : 08f41e370014375a3890a479b94894f22ff042d9 Do not overwrite the value for NICINFO calculated in the conditional block Use the calculated NETMASK in later calls to $FINDIF Pass -C to $FINDIF so the calculated NETMASK is of the correct form CVS patchset: 8287 CVS date: 2005/12/17 07:00:39 --HG-- extra : convert_revision : 68f77ac85ce7b94f928e5105bff24c76fd3528ac 2005-12-16 Abhijith Das fix for bz169087 - split fill_super_block() into read_super_block and fill_super_block. 
calling block_mounters between calls to the two sb functions 2005-12-16 Stanko Kupcevic add path to logfile to error popups 2005-12-15 Stanko Kupcevic Verify that node is subscribed to proper RHN channels 2005-12-15 Steven Whitehouse Some __read_mostly annotation. mkfs changes to allow building of filesystems with visible .gfs2_admin directory. This change makes the "hidden" files and directories visible to users of the filesystem. Basically it interchanges the order of the root and .gfs2_admin directories. There are still some more changes to come in this area, but this is a good start. Again you'll need to upgrade mkfs with the changes I'm about to check in to build a filesystem with this new feature. The plan is that this will remove the need for a large number (all?) of the current ioctl()s but I'll leave them where they are until I'm sure this is working correctly. 2005-12-14 Steven Whitehouse These are the fixes to mkfs in order to get it to build against the new big-endian GFS2 kernel source. I'm not intending to fix the other utilities until I've finished the ioctl() changes as that will break them again for a short while. If you do need them in the mean time, mostly its a question of adding linux/types.h to the include files list, s/le/be at suitable places and using the copy of ondisk.c now checked in here. Long term I'd like to abstract the things in ondisk.c and make a libgfs which all the utilities can share rather than each of them having their own copy of them. This touches just about everything and changes GFS2 to be big-endian on disk, like GFS1. At this point, most of the major data structures will be compatible in layout and endianess. There are a number of more minor items which are still not compatible, largely in areas where there really is a good reason to make changes. There will probably be further metadata changes to come, not least when I look at the ioctl question, which is next on my agenda as it will also require user tool changes. 
All the user tools in CVS for GFS2 are now broken until I check in some changes there (5 mins max I promise!). 2005-12-12 lars If the path to the CMPI headers isn't part of the include search path, the other headers may not be includable. Simplified Makefile.am (because it's now always -Included). CVS patchset: 8268 CVS date: 2005/12/12 21:57:41 --HG-- extra : convert_revision : b68d1e9f019bb25d4924c8150a95c7925b4e3c22 Typos and a redundant configuration directive. CVS patchset: 8267 CVS date: 2005/12/12 21:33:41 --HG-- extra : convert_revision : a6b87e3e783cff06dd7f91b8c5366c16189995a5 2005-12-12 Patrick Caulfield I hate CVS, why does it always ignore this directory? Add flags to 'extra' for showing 2node & error states. Show 2node flag & error state in cman_tool status Some of those messages should go to syslog 2005-12-12 sunjd add MSGFMT CVS patchset: 8260 CVS date: 2005/12/12 05:14:48 --HG-- extra : convert_revision : 1a54773ca7f2c4ceed2fd93c3e96b7e8d88c364a 2005-12-10 Benjamin Marzinski Fix for bz 142849 When you do gfs_write on a stuffed inode, you don't update the page cache, because the inodes are stored in the buffer cache. This doesn't affect reads, because gfs special-cases the stuffed reads. Unfortunately, sendfile needs to use the page cache, because it relies on the destination socket's sendpage routine to work. So my fix is: after you do a write on a stuffed inode, if the first page of the file is cached (it appears from looking at the code that there is already an assumption that stuffed inodes will never be more than a page in length) mark the cached page as not uptodate.
2005-12-10 gshi change /bin/bash to /bin/sh CVS patchset: 8255 CVS date: 2005/12/09 23:30:30 --HG-- extra : convert_revision : 4e03391ca3657ec5129980159b279a43f0c418bf 2005-12-09 Stanko Kupcevic Sample snmpd.conf Make sure all data is read() from buffer before closing fd Added rhcClusterNodesNames, rhcClusterAvailNodesNames, rhcClusterUnavailNodesNames, rhcClusterServicesNames, rhcClusterRunningServicesNames, rhcClusterStoppedServicesNames, rhcClusterFailedServicesNames to rhcCluster, so that users of clients that don't display whole snmp table in a single view (eg. HP OpenView), can see all failed/stopped/... services in one place. Node and service tables haven't been removed. Also, clarified descriptions. 2005-12-09 gshi bug 338: Make quorum architecture pluggable CVS patchset: 8252 CVS date: 2005/12/09 20:15:31 --HG-- extra : convert_revision : 9b8993a54272c9acb928b9720d55f9cfd5b121d1 2005-12-09 Lon Hohberger Fix #175033 part 2 - read in page-size chunks from /proc/cluster/services 2005-12-09 Steven Whitehouse A checkpoint in the sparse annotation. There are still a number of areas producing warnings. Two out of three (the vectored ioctl() and the walk_vm() function and friends) are already on Dave T's hit list so I didn't spend a lot of time sorting them out. The third (jdata) will need some more work. There is also further work to be done adding more annotations, particularly for the ondisk metadata but this is a convenient point to check in what I've done so far. Fix sparse warning caused by use of int rather than gfp_t in a function argument. 2005-12-08 Lon Hohberger Bump SM plugin version Fix #175033 - magma-plugins incorrect read behavior for /proc/cluster/services > 4096 bytes 2005-12-08 Patrick Caulfield Allow re-reads of CCS to set a new expected_votes value. If we can't read CCS then pause this node until we can (maybe need some form of retry in here) Fix some prototypes. 
2005-12-08 zhenh a command line mgmtd client CVS patchset: 8241 CVS date: 2005/12/08 15:11:04 --HG-- extra : convert_revision : 44d69644f2023e6b68e2fe63d1ca06add83f99c0 2005-12-08 Patrick Caulfield Oops, typo. Also tidy comments so they fit on an 80char line. Reinstate lowcomms_close() to tidy up the output queue when a node leaves the cluster. Truncate any partial messages that may be left in the input buffer if a node goes down under memory pressure (thanks to Mark Butler for pointing this out) 2005-12-08 Steven Whitehouse Removed the hfile_trunc ioctl(). It's not used anywhere at the moment, but I've kept the function itself (will produce a function not used compile warning) just for the moment as I've a hunch it will be useful later via a different interface. Tidy endianness conversion in lvb.c. Half the macros were not used anyway. We no longer convert endianness for the structure padding, and the printing code will no longer print its value. This is a precursor to a longer look at the endianness conversion and addition of sparse annotations. Fix up warnings due to kernel function prototypes not matching a couple of GFS2 functions. This patch moved the i_alloc structure into the incore inode. This results in fewer memory allocations (one less per write) as well as removing another of the __GFP_NOFAIL allocations from the code. I've left the get/put functions associated with i_alloc as it might be useful to retain them for debugging and because it maps out the lifetime of this structure. Temporary variable "error" is not required since there is already a variable being used for that purpose. This patch removes an allocation of memory which was occurring on every lookup of a block in a file. The memory required is now allocated on the stack and it's only 20 bytes. A lot less than some of the other stack allocated structures we are using (e.g. struct gfs2_holder) in the code.
It should speed things up and removes one of the __GFP_NOFAIL allocations that we need to remove from the code. Fix to quota bug fix... I had accidentally committed the RHEL4 patch in head as well. This is the proper fix for the head branch. 2005-12-08 sunjd Add an error message CVS patchset: 8236 CVS date: 2005/12/08 02:52:17 --HG-- extra : convert_revision : 1c2b949f0edb0633620ed1a469fdfada17988159 2005-12-07 Stanko Kupcevic Include pegasus headers for zSeries 2005-12-07 Lon Hohberger Fix #175229 - remove unneeded references to clurmtabd; it is no longer a necessary piece for NFS failover Implement 175215: Inherit fsid for nfs exports 2005-12-07 Stanko Kupcevic agent and provider READMEs 2005-12-07 Patrick Caulfield Call out to group_tool for "cman_tool services". A bit of backward compatibility is always nice :) 2005-12-07 Stanko Kupcevic Replaced tmp OID with unique one 2005-12-07 Patrick Caulfield Tell aispoll to remove the client if it errors. 2005-12-07 sunjd add configurations for management modules CVS patchset: 8234 CVS date: 2005/12/07 09:47:32 --HG-- extra : convert_revision : b1f6ad56c99677c17795f92481e20bcee9b52162 2005-12-06 Stanko Kupcevic Signal-safe logging Resources used after their release 2005-12-06 James Parsons Fix for bz168698 2005-12-06 Lon Hohberger Fix #175114 - rgmanager uses wrong stop-order for unspecified resource agents Fix #175108 - rgmanager storing extraneous info using VF Fix #175106 - lsof -b blocks when using gethostbyname causing slow force-unmount when DNS is broken Fix #174819 - clustat crashes if ccsd is not running Fix #173916 - rgmanager log level change requires restart Fix #173526 - Samba Resource Agent Fix #171236 - pass 1 - ia64 alignment warnings Fix #171153 - pass 1 - clustat withholds information if run on multiple members simultaneously Fix #165447 - ip.sh fails when using VLAN on bonded interface 2005-12-06 Patrick Caulfield sockets should be non-blocking!
Comment tidy CCS node names override temporary ones created from the IP address. Recalculate quorum after re-reading CCS. Sort nodes by nodeid 2005-12-06 Steven Whitehouse Remove an unused variable. A two line patch to fix a bug where the quota_enforce setting is ignored. Thanks to: Marc Curry for reporting the bug and testing the fix and Chris Feist for building the RPMs for Marc to test the fix 2005-12-05 Stanko Kupcevic Restart daemons on upgrades, compile with debugging info Abort command execution after 3 seconds Memory corruption due to libxml2 not being thread safe 2005-12-02 Stanko Kupcevic Memory leak in Socket, remove pidfile on exit, catch SIGCHLD spec and code cleanup added logging with multiple levels of verbosity 2005-12-01 Patrick Caulfield Update to latest AIS (taken straight from svn now) Make thread count configurable. 2005-11-30 Patrick Caulfield (re-)read the two_node flag from CCS so we can change it on the fly. 2005-11-30 Stanko Kupcevic Daemonization of clumond 2005-11-29 Stanko Kupcevic specfile, dependencies, buildsystem 2005-11-29 Robert Peterson Fix for Bugzilla Bug 155304 – gnbd_monitor doesn't correctly reset after an uncached gnbd has failed and been restored. 2005-11-29 Steven Whitehouse Sorry - another ondisk format changing check in... I've reordered the fields in the common metadata header and added a pad field so that this structure is now of identical size as in gfs1. It also has identically functioning fields in the same place. One new field is added, mh_blkno which contains the block number of the block in question. This is instead of the generation number used in gfs1's journaling and now no longer needed. The mh_blkno field is checked only once in the code and could be moved into the structure where it's required, rather than being left in the common header.
(I checked this with Ken who added the field in the first place) I have a thought that it might be a useful feature to have all ondisk metadata checksummed in which case this header field might be just the right place to put it. Also some padding is added to the dinode, sb and rgrp structures in order to make them compatible with the earlier gfs1 versions. Again fields are now in the same places as in gfs1 with padding added where required since the fields have got smaller. I don't believe that there will be any significant performance impact in changing these structures, but it should make things much easier when it comes to migration. There are other structures (I think that the various directory metadata comes under this heading) where removing the padding is likely to improve performance significantly and so I'm intending to take a different approach there. There is likely to be at least one more metadata change for gfs2 in the not too distant future. My next main concern is to make gfs2 big-endian on disk, to match gfs1. Beyond that, all the other changes should be much easier to cope with. I did a test recently and found that given a few 32bit fields to copy, adding endian conversion made about 12% difference to the speed of copying (which we are doing anyway). If we keep the two versions with the same on disk endianness then a lot of the metadata will not need conversion at all (given the changes in this patch). So an upgrade will be quite quick, rather than needing to rewrite each and every metadata disk block. This patch introduces the tty_write_message() function rather than doing the dereferences directly. Make this look like the code for ext2/3. I'm not sure about the test that I've left in the code here. ext2/3 doesn't have it and I suspect we don't need it, but I've left it for now just in case. 2005-11-25 davidlee Improve PF_ARGV_TYPE setting. Ensure is PF_ARGV_NONE for Solaris. (Also bugzilla 967.)
CVS patchset: 8190 CVS date: 2005/11/25 15:19:34 --HG-- extra : convert_revision : 1f56b072fa37eac9e39ab83dd771049222b1e950 2005-11-24 Stanko Kupcevic build system 2005-11-23 Stanko Kupcevic clumond c++ rewrite 2005-11-23 Steven Whitehouse N.B. This patch changes the ondisk format of GFS2, so you'll need to remake filesystems. The idea here is to make the metadata closer to that of GFS1 in order to reduce the amount of work required to migrate GFS1 -> GFS2. This patch changes the numerical value of a few constants to match those of identical function in GFS1. One of the GFS1 constants is no longer used in GFS2 (as noted in the comments) and there are two new constants added. As far as I can tell there are no more constants in GFS2 which are not compatible with those in GFS1. All the remaining ondisk incompatibilities are due to structures changing size and I'll be taking a look at those next. 2005-11-22 Stanko Kupcevic Package docs, build 0.9.2 Addition of html documentation 2005-11-21 Lon Hohberger Allow scripts to inherit the name attr of a parent in case the script wants to know it (#172310) Fix #162605 2005-11-21 David Teigland Check in Wendy Cheng's fix: Flush pages into storage in case of DirectIO falling back to BufferIO. Check in Kevin Anderson's mount sync patch: Without the patch, there is a large performance hit when using the -o sync mount option on gfs filesystems. The problem is that pages are not being flushed to disk when the gfs_writepage routine is invoked due to the transaction not yet being completed. Ref: bugzilla 173147. 2005-11-18 alan Minor improvement to the OCF ocf_is_root() function... CVS patchset: 8170 CVS date: 2005/11/18 05:48:40 --HG-- extra : convert_revision : b98a0e628846e467e5828fcbbe16715269487fd6 Applied a patch to apache from Mizutani Koji to allow for both the AddModule and to LoadModule parameters as the same.
CVS patchset: 8169 CVS date: 2005/11/18 05:41:00 --HG-- extra : convert_revision : c7c17454b3012399c98cfa83e32c1c683fc211f6 2005-11-16 Stanko Kupcevic RPM prerequisites & alpha build Message to add two nodes in order to detect shared storage Tooltips and touch-ups Display size in GBs 2005-11-15 David Teigland When using lock_dlm, all gfs2 unmounts would panic in invalidate_inode_buffers (<- invalidate_list <- invalidate_inodes). Invalidate_inodes requires that the sb inodes list not change while it's running, but async unlock completion callbacks from lock_dlm are scheduled during invalidate_inodes. These callbacks do glock_put() which does a final iput(), clearing the inode and causing the panic. The fix is a new semaphore (sd_invalidate_inodes_mutex) which blocks glock_put's during invalidate_inodes. 2005-11-15 xunsun by this fix it should work for systems who have id in PATH CVS patchset: 8157 CVS date: 2005/11/15 04:22:30 --HG-- extra : convert_revision : 0791605047843f25d297f43f9cd7de083f2f6797 2005-11-14 Patrick Caulfield Don't spam ccsd with diff config entry names. add htons() etc on nodeid in totem_ip_address. 
2005-11-14 xunsun sh portability fix CVS patchset: 8150 CVS date: 2005/11/14 06:21:24 --HG-- extra : convert_revision : f9904a4b696ba9d62139cfe699b6b560ed17046b 2005-11-13 davidlee Portability fixes; ensure "sh" is Bourne-compatible; avoid bash extensions CVS patchset: 8148 CVS date: 2005/11/13 17:16:43 --HG-- extra : convert_revision : 8c444bb86b0360466d328d9fe385bb88b6a21e16 simplify the character testing CVS patchset: 8147 CVS date: 2005/11/13 17:00:02 --HG-- extra : convert_revision : fff895d4bca2c60549c22efa849b58ff69a316ec 2005-11-12 xunsun use either ocf_is_hexadecimal or ocf_is_hex, but not both CVS patchset: 8145 CVS date: 2005/11/12 14:28:17 --HG-- extra : convert_revision : 8af878613710a30b2cbd90e6ea82f972e9b3fba1 2005-11-12 Stanko Kupcevic Minor GUI touch ups 2005-11-12 David Teigland remove option debugging 2005-11-11 David Teigland deal with gfs1/gfs2 differences better don't restrict to gfs2 2005-11-11 davidlee Some bash-isms were making the OCF "sh" scripts non-portable. Also identified some common code blocks, and separated these out into "resources/OCF/ocf-shellfuncs.in". Files in this update: resources/OCF/ocf-shellfuncs.in resources/OCF/IPaddr.in resources/OCF/IPaddr2.in resources/OCF/drbd.in CVS patchset: 8140 CVS date: 2005/11/11 16:15:02 --HG-- extra : convert_revision : 79e07f7e41ff733b106d718f59b73fa7bbbf13ec 2005-11-10 David Teigland debugging output and options 2005-11-10 Patrick Caulfield new AIS incorporating nodeid zeroing patch that got missed. 2005-11-10 Stanko Kupcevic Initial checkin of clumon 2005-11-09 David Teigland mount.gfs/umount.gfs are mount.gfs2/umount.gfs2 extraneous \n Code to add/del gfs entries from /etc/mtab; junk I was hoping to avoid but can't. 
2005-11-09 davidlee Solaris/pkg: make pkg name a least-evil compromise: {ConfigureMe configure.in} CVS patchset: 8123 CVS date: 2005/11/09 14:20:12 --HG-- extra : convert_revision : 122732dce73a7ea713e1684c8d0bb4a741c7fad7 2005-11-09 panjiam minor change for TIPC CVS patchset: 8121 CVS date: 2005/11/09 10:53:11 --HG-- extra : convert_revision : 636183bea3239d49d6c388acf8357be7613d8163 updated for communication module CVS patchset: 8119 CVS date: 2005/11/09 10:39:00 --HG-- extra : convert_revision : 93a2e86fcea3675bef90e2ca22c6e6534ee6c46b 2005-11-09 Stanko Kupcevic Initial checkin of cs-deploy-tool 2005-11-08 David Teigland look in /proc/mounts instead of /etc/mtab for gfs mounts add fixme comment Share common code between mount/umount; both about finished. 2005-11-08 Patrick Caulfield Read key after we've filled in the whole of the totem_config struct, as it now seems to depend on bits of it. If AIS passes us a node ID then we should believe it. Also cope with not being able to resolve IP addresses at startup. If AIS passes us a nodeid that matches then we can use that one. new AIS Fix log message that could crash daemon Don't show joined time for non-members. 2005-11-08 xunsun do not output the "* Program is not running" message CVS patchset: 8105 CVS date: 2005/11/08 02:08:57 --HG-- extra : convert_revision : aeb1503ba41e77ac0627e84178e87d8b04cdb363 2005-11-07 David Teigland Every file should #include the headers containing the prototypes for its global functions. Signed-off-by: Adrian Bunk dlm_find_lockspace_name is now the static one Remove some unused functions, make others static.
Signed-off-by: Adrian Bunk 2005-11-07 James Parsons Fixes bz172464; adds WTI RPS10 agent to build 2005-11-07 xunsun = is more POSIX compatible than == CVS patchset: 8093 CVS date: 2005/11/07 15:44:22 --HG-- extra : convert_revision : 686229dbeb00fadf514c8c4d7cec9bcc5b53c559 = is more POSIX compatible than == CVS patchset: 8092 CVS date: 2005/11/07 15:22:09 --HG-- extra : convert_revision : 37637463fbe684ad80ed028ecd3bc0f89033ba68 2005-11-07 sunjd bug937: add another judgement; use the more portable style provided by alan CVS patchset: 8087 CVS date: 2005/11/07 06:00:33 --HG-- extra : convert_revision : cae531b4d306ebcf86e77651fb05085710790ba0 2005-11-06 sunjd correct a typo CVS patchset: 8083 CVS date: 2005/11/06 17:00:43 --HG-- extra : convert_revision : 1a4f8ba3ad475d0064d2451f93072eec4c4c6e08 2005-11-04 panjiam added byteorder check CVS patchset: 8075 CVS date: 2005/11/04 15:58:51 --HG-- extra : convert_revision : 422366521e6f29c81a8253eeed18573f435c4116 2005-11-04 Lon Hohberger Fix for 172441 from jparsons 2005-11-04 panjiam updated for new cim tools CVS patchset: 8072 CVS date: 2005/11/04 07:20:35 --HG-- extra : convert_revision : f32677ff4451bdeef072ca2cd4b79f7d75d08c73 2005-11-04 xunsun Oops, typo CVS patchset: 8068 CVS date: 2005/11/04 06:28:22 --HG-- extra : convert_revision : dccb2631003e9dfe389a246dd77b8c2a97b632fa Bug 937 -- split arguments within IPaddr, to work with IPaddr OCF RA CVS patchset: 8067 CVS date: 2005/11/04 06:19:08 --HG-- extra : convert_revision : 49004c0dcbebf6a40e1430a48f866bc1c1ec1c9d 2005-11-03 Lon Hohberger Fix #172401 2005-11-03 David Teigland don't use patches any more not sure why these are still here 2005-11-03 xunsun @prefix@ cleanup and return code fix CVS patchset: 8059 CVS date: 2005/11/03 10:32:47 --HG-- extra : convert_revision : d3e4ba1b0c92a96266a40c8a8c2b352600363cf4 these trailing \'s are significant, I should not had removed them several days ago CVS patchset: 8057 CVS date: 2005/11/03 09:36:14 --HG-- extra : 
convert_revision : 35efd56d5db38aeed08b0be285f4ef7379a408bf add wrapper for IPv6addr OCF RA -- bug 327 CVS patchset: 8054 CVS date: 2005/11/03 07:42:49 --HG-- extra : convert_revision : d9149f37b1a9982175c218eccbf575088a6d951d fuser refinement: killing harder eventually, but do not incur sleep CVS patchset: 8053 CVS date: 2005/11/03 07:10:04 --HG-- extra : convert_revision : 4c61e934393ddc4b3c3bf782aeea1eefe23f118a changing the direct specification of an awk into an @AWK@, as suggested by David Lee CVS patchset: 8048 CVS date: 2005/11/03 06:44:18 --HG-- extra : convert_revision : aa23bf3066a34d4d3b77f8940b6ae0511e5588b6 2005-11-02 Lon Hohberger Fix rest of 172178 2005-11-02 horms Don't use gawk specific --source CVS patchset: 8031 CVS date: 2005/11/02 05:57:33 --HG-- extra : convert_revision : 6815538da2ed8bf9a02cf2736849f81e5e2be547 Use awk instead of gawk as there doesn't seem to be anything gawk specific in there CVS patchset: 8030 CVS date: 2005/11/02 05:57:15 --HG-- extra : convert_revision : ba16cf03c9e0412b83291f840ab0c081c608061c 2005-11-01 Lon Hohberger Fix bugs 172177, 172178 2005-11-01 panjiam removed hardcode path, added --with-provider-dir option for CIM CVS patchset: 8016 CVS date: 2005/11/01 07:46:29 --HG-- extra : convert_revision : 1e506aa2f6551cc358421b1558d02f2081fc9b1a 2005-11-01 xunsun OCF_RESKEY_incarnation_no is not user-configurable, thus we can not validate it CVS patchset: 8013 CVS date: 2005/11/01 03:32:41 --HG-- extra : convert_revision : 02915e59abc9d6a854a2dee0986801ae3ac62181 2005-11-01 panjiam made it consistent with CIM/register.sh CVS patchset: 8012 CVS date: 2005/11/01 03:14:25 --HG-- extra : convert_revision : 74a2fbd1efe1b3ded9eb8ffc296f80065e65e4db 2005-11-01 sunjd add thread safe option CVS patchset: 8010 CVS date: 2005/11/01 02:56:36 --HG-- extra : convert_revision : a1d68a178f6ffed1ee733970525565c5ac77d924 2005-11-01 Alasdair G. 
Kergon test commit 2005-10-31 David Teigland do mount option munging on a copy of gfs's hostdata buffer build mount.gfs2 and umount.gfs2 Write our own reduced option parsing routine and don't bother using parse_opts() from util-linux/mount -- no need for util-linux at all now. 2005-10-31 Alasdair G. Kergon test checkin 2005-10-31 David Teigland split depends line 2005-10-31 Lon Hohberger Apply patch from Axel Thimm to fix bz172066 2005-10-31 xunsun make it explicit that the fstype OCF instance parameter is optional CVS patchset: 7982 CVS date: 2005/10/31 06:08:30 --HG-- extra : convert_revision : d107a37edc6f6a5b775197c85be3e183a1aa39ad bug 920 -- samba support for Filesystem RA CVS patchset: 7979 CVS date: 2005/10/31 03:15:18 --HG-- extra : convert_revision : 97dd296a52cf3c564d1dbe6db045a45bcf384c68 remove the uncomfortable trailing /s CVS patchset: 7978 CVS date: 2005/10/31 02:59:54 --HG-- extra : convert_revision : 0159faaab593f73c128dc41f0fd1866c67b3ac44 2005-10-29 andrew Detect the native UUID implementation present in recent Darwin releases If -luuid is required it will be already present in $LIBS so remove explict usage from Makefiles CVS patchset: 7972 CVS date: 2005/10/29 08:43:06 --HG-- extra : convert_revision : a29b896145e57ef20a37c327e6ac9ff397433aab 2005-10-28 David Teigland On unmount just leave the lockspace and exit, user space umount.gfs will leave the mount group umount helper need EXPORT_SYMBOL_GPL Pass the correct mount options. Accept connections for mount requests from mount.gfs. Do mount processing when we get this request instead of waiting for a mount uevent from the kernel. Return hostdata string to mount.gfs that's passed to gfs through mount(2). jid/id/first values previously set in sysfs now passed in hostdata. Remove "do mount" uevent message to user space and waiting for "mounted" reply. Now assumes that gfs mount helper has done everything already. 
2005-10-27 Lon Hohberger Ensure rgmanager doesn't block SIGSEGV when debug is not enabled. 2005-10-27 David Teigland default to 1 vote if no value given in cluster.conf untested interaction with lock_dlmd added 2005-10-27 gshi make node deletion work 1. maintain a delhostcache file for deleted files 2. hb_delnode to delete a node hb_addnode to add a node TODO: 1) make hb_delnode/hb_addnode accept multiple nodes 2) make deletion only works when all other nodes are active 3) make CCM work with node deletion CVS patchset: 7956 CVS date: 2005/10/27 01:03:21 --HG-- extra : convert_revision : f26c6e40b0a8400cfa72a837706ad246ad989aac 2005-10-26 andrew Define 2 new return codes so that we can discover master/slave resources correctly. Include discriptive comments about their usage. Support them in the PE. CVS patchset: 7948 CVS date: 2005/10/26 11:34:49 --HG-- extra : convert_revision : 7cf7d09e44ec3d33a3875de0b2e73e25b39dd949 2005-10-25 panjiam added configure options for CIM CVS patchset: 7938 CVS date: 2005/10/25 07:29:14 --HG-- extra : convert_revision : fe4198cf69ba33156effec38e44894555b22a801 2005-10-24 gshi add Makefile in config CVS patchset: 7935 CVS date: 2005/10/24 18:27:16 --HG-- extra : convert_revision : 08592d6f602a9e6d922392bb74aec76cecafab03 2005-10-24 David Teigland The start of a mount.gfs2 program that is called by mount(8). This allows us to do things in user space (like interact with the cluster infrastructure) before mount(2) is called. None of the clustering bits have been added here yet. The mount(8) source is a real mess and it's not simple to integrate an fs-specific helper without duplicating a pile of the mount(8) code within the helper program. Currently the Makefile downloads a modified version of util-linux, builds util-linux/mount and then compiles mount.gfs2 using util-linux/mount/libmount.a. A better solution may require a lot of cleanup in util-linux/mount, unfortunately. Add comment with Ken's quota summary. 
move quota sync and refresh to sysfs 2005-10-24 Jonathan Brassow - millennium latency fix for 2.6 2005-10-24 sunjd (Done by Sun Xun) * explicitly distinguish block device from other things(NFS mount point, -L or -U options for mount). A variable is added for this purpose, blockdevice={yes|no}. * when [ $blockdevice = "yes" ], refine the Filesystem_status check. * add (replace $MOUNT check with) /proc/mounts check, since this tend to be more accurate then $MOUNT output. * refine return status of Filesystem_status. CVS patchset: 7932 CVS date: 2005/10/24 15:16:09 --HG-- extra : convert_revision : 4b983d9958d2653ccffa9998775daaffd516bef2 (Done by Horms in 1.3, poted by Sun Xun) To solve the temp file vulnerability CVS patchset: 7930 CVS date: 2005/10/24 15:08:31 --HG-- extra : convert_revision : bdc810efdd5f74b2e9d7f00279254d8bc1331422 2005-10-24 panjiam updated for CIM CVS patchset: 7917 CVS date: 2005/10/24 05:10:00 --HG-- extra : convert_revision : 056c1352a5353f8b7774f5b53a3fa498e4771376 2005-10-21 Lon Hohberger Fix #171253 2005-10-21 andrew Revert David's ltdl changes - they break the build for non-developers. CVS patchset: 7915 CVS date: 2005/10/21 07:21:42 --HG-- extra : convert_revision : 3dce01f34bb0c1db2a39c78a0a5761911b927b1d 2005-10-19 lars Return correct exit code for stopped resource. Spotted by Stefan Peinkofer Convert echo to ocf_log. 
CVS patchset: 7898 CVS date: 2005/10/19 19:09:43 --HG-- extra : convert_revision : cf771b5ecd9911c7200ed777642e006d9a3d3c82 2005-10-19 davidlee Fix a LIBS problem in the getpid test CVS patchset: 7896 CVS date: 2005/10/19 08:40:00 --HG-- extra : convert_revision : bcd314ba1dad1b5d028ff891de32a9764fc0b2de 2005-10-18 Lon Hohberger Mono-NFS server resource agent 2005-10-18 sunjd ( Done by Sun Xun ) * added "status" operation, so that OCF wrapper can call it CVS patchset: 7878 CVS date: 2005/10/18 08:20:47 --HG-- extra : convert_revision : e02cf875ea5855a8ef3b685284ee469408748c8d 2005-10-18 gshi if the test failed, then we define GETPID_INCONSISTENT not the other way around CVS patchset: 7871 CVS date: 2005/10/17 22:37:03 --HG-- extra : convert_revision : 5c42a7bdd5836aad38afa179c705fcdc29f1e7fd modify pidtest according David Lee's suggestion make pidtest.c return non-zero on failure, 0 on success move pidtest.c from tools/ to config/ use AC_TRY_RUN to test run the program CVS patchset: 7870 CVS date: 2005/10/17 22:29:00 --HG-- extra : convert_revision : 4a7bee49aa5d894bebcdf774c1b99936b2295e59 2005-10-17 Lon Hohberger Add logging library Mono-NFS server support (e.g. one NFS server per cluster, active-passive). Add support for inheritance in the form "type%attribute" instead of just attribute so as to avoid confusion. Fix 150346 - Clustat usability problems Fix 170859 - VIPs show up on multiple members. Fix 171034 - Missing: Monitoring for local and cluster file systems in... Fix 171036 - RFE: Log messages in resource agents 2005-10-14 zhenh add haclient.py CVS patchset: 7854 CVS date: 2005/10/14 20:30:44 --HG-- extra : convert_revision : 242ce3b6a1df5ed9f21a90e4b990ce4c14b4f454 2005-10-13 David Teigland get block size from sb instead of defunct statfs comment out bits that used "statfs" entry in sysfs that's now been removed... 
need to find another way of getting at least some of that info /sys/fs instead of /sys/kernel base dir in sysfs is now /sys/fs/gfs2 register gfs under /sys/fs instead of /sys/kernel. requires the kernel to be patched with fs_subsys.patch declares fs_subsys for /sys/fs/ Port plock management from lock_dlm kernel module to lock_dlm userland daemon. Simpler as it runs over cman comms instead of dlm range locks. 2005-10-13 davidlee Improve Solaris pkg installation locations CVS patchset: 7831 CVS date: 2005/10/13 11:36:54 --HG-- extra : convert_revision : 56188312326b3a13f09633304c0f3441f8dcfbaa 2005-10-13 sunjd fix the EAGAIN error CVS patchset: 7829 CVS date: 2005/10/13 09:27:16 --HG-- extra : convert_revision : 8f47cf31729286149458728e2856f7f66fa0b39b 2005-10-12 gshi add getpid() to configure.in and disable the pid match check for a client if the test does not pass in configure.in CVS patchset: 7819 CVS date: 2005/10/12 19:32:45 --HG-- extra : convert_revision : cbd6e7bd74fae04334f0db3e8f64215d0e313343 2005-10-12 David Teigland use ALIGN instead of MAKE_MULT8 last checkin for list_for_each_entry_safe_reverse was incomplete Use list_for_each_entry_safe_reverse. Requires the kernel to be patched with list-safe-reverse.patch. adds list_for_each_entry_safe_reverse to list.h Add a couple comments. 2005-10-12 sunjd ( Done by Sun Xun ) * track the status of the pseudo resource CVS patchset: 7804 CVS date: 2005/10/12 09:06:35 --HG-- extra : convert_revision : a10ef4cf527de098fcbf4a54a21d8d3667f40c5b (Done by Sun Xun) * fix a bug that "$!" 
become meaningless when the "sleep and bell" stuff are performed within foreground subshell * become robust against empty pid file(this may happen due to some failure) * remove unused code CVS patchset: 7803 CVS date: 2005/10/12 08:09:57 --HG-- extra : convert_revision : b129b8d0b910d75465164ea2ce25ca1a559eb844 2005-10-11 David Teigland Showing extended statfs info in sys set a couple people off, just removing it instead of wasting my time arguing with people. We can try to put it back later on its own. file WHATS_NEW was initially added on branch STABLE. use list macro use list_for_each 2005-10-11 sunjd ( Done by Sun Xun ) * track the status of the pseudo resource by $VARRUN/WinPopup CVS patchset: 7799 CVS date: 2005/10/11 09:55:40 --HG-- extra : convert_revision : 43f0d7309d0b72b4137290c506f623a12ce39bd1 2005-10-10 David Teigland refresh for latest -mm remove some log_debug's add a static 2005-10-07 Patrick Caulfield Slightly better error handling. Pull slightly updated openais tarball. 2005-10-07 David Teigland withdraw if the lock module returns LM_OUT_ERROR missing include 2005-10-06 David Teigland if the lock module returns LM_OUT_ERROR, withdraw from the cluster When the dlm returns an error, return LM_OUT_ERROR to gfs instead of panicking; gfs can then try to withdraw. define LM_OUT_ERROR for a lock module to return 2005-10-06 Benjamin Marzinski Update gfs2_tool to make use of sysfs instead of ioctls where appropriate. 2005-10-06 Patrick Caulfield Set the version number 2005-10-06 lars "monitor" op should only return OCF_NOT_RUNNING for stopped resources, not if the monitoring failed. Cleaned up the code of the affected function. CVS patchset: 7788 CVS date: 2005/10/06 11:08:01 --HG-- extra : convert_revision : 59a14ba6c7211412c682f428fa6a2e4a7e62681a 2005-10-06 Patrick Caulfield Don't default to 0 votes, it's annoying. 
2005-10-05 Lon Hohberger Don't build clufindhostname; it's not even in the tree 2005-10-05 David Teigland need smp_lock.h for lock_kernel() don't include the kernel's lm_interface.h, print lm flags value, don't try to interpret 2005-10-04 Lon Hohberger Incorporate patch from 165447 Clustat partial rewrite (still needs updated XML output) 2005-10-04 David Teigland update FIXME comment Add comment explaining how queue_empty() is used. Use kref structs and operations to do glock reference counting instead of our own counter. 2005-10-04 horms Conditional ldirectord build CVS patchset: 7764 CVS date: 2005/10/04 09:25:09 --HG-- extra : convert_revision : 043d4ed0878250be4ec036480f8510ec3aacfe49 2005-10-03 Patrick Caulfield Stick a version number in the pipe protocol so we can protect ourself from future incarnations of tools. Don't die if we get SIGPIPE Add DLM_LKF_FORCEUNLOCK so device.c doesn't have to muck about with locks that are in progress. 2005-09-30 Patrick Caulfield Lock ourself in memory Don't send two config change events for a "cman_tool expected -e". Tidy kill message. Use inet_ntop() to print IP addresses so the look nicer. 2005-09-30 sunjd configuration update for CIM providers CVS patchset: 7742 CVS date: 2005/09/30 09:11:38 --HG-- extra : convert_revision : 0f0dbfbdd45080c54267501e76bff7bbad309c8f 2005-09-30 Patrick Caulfield "cman_tool status" prints the multicast address too. Use slightly more RFC-compliant multicast address. Install the AIS keygen program. 
2005-09-29 sunjd (Done by Tim Verhoeven - dj@rootshell.be) - Changed some regexes to be more robust to whitespace - Added an extra case if the pid file is defined as a relative path to the serverroot - Fixed an ocf_log command CVS patchset: 7736 CVS date: 2005/09/29 08:10:12 --HG-- extra : convert_revision : 1c2f0a468172d6496f10f9877356deb1afd78bfb 2005-09-28 David Teigland move shrink and statfs_sync from ioctl to sysfs handle sysfs dirs for both gfs1 and gfs2 2005-09-28 lars The check was trying to be too smart and thus failed. CVS patchset: 7725 CVS date: 2005/09/28 08:33:40 --HG-- extra : convert_revision : b55ac2eb871f8a2a47513b8bb97eac1f3bd28fc8 2005-09-28 David Teigland update to work with new sysfs organization (doesn't work for gfs1 yet) add \n missed the sys_fs_uninit at unmount 2005-09-27 David Teigland add a couple basic sysfs attributes 2005-09-27 Benjamin Marzinski gfs2 was unable to truncate files if they were opened with write permissions, but the process later changed uids to a uid that didn't have write permission. POSIX says that if you had the permissions when you opened the file, you don't lose them when you change uids. gfs2 no longer checks permissions before truncating a file in gfs2_setattr. I don't know why this check was ever added (it didn't exist in gfs 6.0). Since I don't know why the check was added, I can't say for certain that it is unessential. However, I can't see any reason to do it, so I deleted it. 2005-09-27 David Teigland The fs kobject is now passed into the lock module at mount time. It's ignored by nolock and gulm. Kobject added to gfs1 so the same lock module can operate with both gfs1 and gfs2. Pass the kobject for the fs into lm_mount() so the lock module's kobject can appear below it in sysfs. 2005-09-27 lars Propagate the command output. (Duh!)
CVS patchset: 7718 CVS date: 2005/09/27 15:25:53 --HG-- extra : convert_revision : 2ea9e378ed56f9aa39acd3581d074b6d801709da 2005-09-27 Patrick Caulfield rename "commskey" to "keyfile" as that's slightly better. remove redundant entry Don't overwrite AIS node addresses just because a nodeid matches Keep nodes in ID order 2005-09-27 horms Resolve syntax error CVS patchset: 7701 CVS date: 2005/09/27 03:04:52 --HG-- extra : convert_revision : 59aedae9b44f34befbe60f22be8b317b4731c4ad 2005-09-26 Patrick Caulfield Fix usage message 2005-09-26 sunjd (Bug417) Done by Sun Xun Add the validate-all action and other polish CVS patchset: 7694 CVS date: 2005/09/26 08:49:00 --HG-- extra : convert_revision : 87428ac99900751b1058d560a5e969599512098f (Bug 417: Done by Sun Xun ) Add the validate-all action and other polish CVS patchset: 7693 CVS date: 2005/09/26 08:46:45 --HG-- extra : convert_revision : d20c2e52f30775af18ac40776e32a3bb83359edb (Bug 417: Done by Sun Xun ) *validate-all *allow the user not to specify fstype *do not require instance parameters when doing "meta-data", and "usage" *misc cleanup CVS patchset: 7692 CVS date: 2005/09/26 08:44:30 --HG-- extra : convert_revision : 8aa679a10e55cece9cf519e599a3eac903d8b47e 2005-09-23 Patrick Caulfield Tidy multicast code and use a suitable (ip4 or 6) default if none is specified. 2005-09-23 lars Some more debugging and cleanups. CVS patchset: 7673 CVS date: 2005/09/23 12:58:29 --HG-- extra : convert_revision : db180c6234bd2c7c12780a9d8ab5de719b2449b3 Notification support updated to reflect reality. Minor bugfixes here and there. CVS patchset: 7672 CVS date: 2005/09/23 12:42:40 --HG-- extra : convert_revision : 1c7d620401af46a8215b4dc82f5d8bb8a0873949 2005-09-23 Patrick Caulfield Allow CCS version to go forward between cman_tool join & startup, but not backward. Don't put an incarnation number on a node newly arrived from CCS. 
Add uS to the log timestamp Updating ccs_tool's editconf commands for new cman schema (note: the old schema will still work [nodeids permitting] but the new one is slightly simplified) 2005-09-22 alan Putting back the previous version of configure.in, now that I've put the necessary change in for 2.0.2 CVS patchset: 7668 CVS date: 2005/09/22 20:22:12 --HG-- extra : convert_revision : e1c9dd13a9174ae103fe33fe05623507b6fda5e7 Had to undo some changes in this to allow 2.0.2 to go out... Sorry! CVS patchset: 7666 CVS date: 2005/09/22 20:11:45 --HG-- extra : convert_revision : 6045bb60f1e6b51eb84e326b2e6f4bebdaf16bbf 2005-09-22 Benjamin Marzinski add watchdog.o to the rgmanager object files, to fix undefined reference to watchdog_init. 2005-09-22 alan Put in 2.0.2 update lines. CVS patchset: 7658 CVS date: 2005/09/22 16:29:33 --HG-- extra : convert_revision : f91a03b865c45f3326cf6cf629d85383d9c9e9a6 2005-09-22 Benjamin Marzinski switched 'deamon' to 'daemon' in uninstall rule. 2005-09-22 lars More changes to sketch the crm_master hint call. CVS patchset: 7655 CVS date: 2005/09/22 13:42:45 --HG-- extra : convert_revision : 34a89ef52e102d46289df8b168c4237dfe0a047f One more step towards master/slave-aware drbd. CVS patchset: 7653 CVS date: 2005/09/22 10:35:55 --HG-- extra : convert_revision : e817973077ac6084595ee3e9e56847020cbf78fb 2005-09-22 Patrick Caulfield Cope with node names in CCS that we can't resolve (provided they have a node ID) and warn if odd things happen. 2005-09-21 David Teigland move a bunch of stuff from ioctl to sysfs minor fixups 2005-09-21 Patrick Caulfield Don't try and do floating point maths in the preprocessor. Use a temp variable for the node address, to avoid potential alignment problems. new AIS version byteswap the header too Need to return earlier if the socket failed to connect. Refresh cluster FD before each select. 
If any of the queues have cached messages in them, then return an fd for /dev/zero from cman_get_fd() - that way it will be guaranteed to be active for read so we will get called back to flush those queues. Also, use the new GETNODECOUNT call to get the nodecount rather than getting the whole node list. Missing comma & comment. 2005-09-21 zhenh move haresources2cib.py.in from cts/ to tools/ CVS patchset: 7630 CVS date: 2005/09/21 06:48:24 --HG-- extra : convert_revision : 10a30e3871441e82fa2c120b4a7420a471c30b06 add management tool directories CVS patchset: 7627 CVS date: 2005/09/21 05:34:39 --HG-- extra : convert_revision : 9b41276d792333e5746f8665c8057d63dbf15e47 2005-09-20 Patrick Caulfield Updated patch that works rather better. 2005-09-19 Patrick Caulfield Patch for clvmd so that it can be used with libcman. With a dynamic libcman, the same clvmd should be able to run with a kernel or userland cmand depending on the library installed. When libcman is packaged I will apply this to the main clvmd sources Add CMAN_REASON_PORTOPENED callback reason. 2005-09-16 Stanko Kupcevic Fixed bz167217, and handling of DOWNed interfaces 2005-09-16 Patrick Caulfield Fix up the port notification and add a new PORTOPENED notification because we can. 2005-09-16 Stanko Kupcevic Added watchdog that reboots if clurgmgrd crashes 2005-09-15 Patrick Caulfield Use ccs as the repository of config & node information (that is, after all, why it is there!) rather than passing it around in extraneous cluster messages. This also has the effect that we can re-read CCS if a node joins the cluster with a higher config version than us, making "cman_tool version" redundant. That should please some people. 2005-09-15 alan Now all the basic pieces to bug 132 are in place - except for a config option to test it with.
This means there are three auto-join modes one can configure: none - no nodes may autojoin other - nodes other than ourselves can autojoin any - any node, including ourself can autojoin None is the default. CVS patchset: 7559 CVS date: 2005/09/15 03:59:08 --HG-- extra : convert_revision : 3fa6a00adecea9f14439b0dc8d44cf771d449844 2005-09-14 A. J. Lewis Initial commit of fsck for GFS2 o this is based on the fsck for GFS1 with changes to handle the ondisk format changes introduced in GFS2 o This is a work in progress - it will not throw errors on a clean filesystem, but it has not been extensively tested yet, so be careful. 2005-09-14 Patrick Caulfield Strip down barriers so they use the VS features of AIS. 2005-09-14 Benjamin Marzinski fixed acl code so that acls are displayed when enabled, and not displayed when disabled. Also cleaned up the indenting of my last checkin. When you copy an suid root file to gfs, you start a transaction on the write and then make a vfs call that eventually tries to start another transaction for changing the file attributes. This causes gfs to print a warning and not get the attributes right. After this change, if you already have a transaction started in gfs2_setattr_simple, instead of failing the warning assert, you simply use the existing transaction. 2005-09-12 Stanko Kupcevic Fixed bz167769: fs.sh doesn't do 10 & 20 OCF_CHECK_LEVEL checking 2005-09-12 davidlee Default 'localstatedir' is poor: detect and warn CVS patchset: 7531 CVS date: 2005/09/12 11:29:54 --HG-- extra : convert_revision : 96830448990087f4d4d14b516a64ef2b0be5e86a 2005-09-10 alan Fixed a spelling error in an error message. CVS patchset: 7516 CVS date: 2005/09/10 17:55:25 --HG-- extra : convert_revision : 66af1f4f033c1935fcad384ab45888f8797a345a 2005-09-08 A. J. Lewis o Make sure the link counts of directories are properly incremented 2005-09-08 davidlee Remove the "libltdl.tar" aspects of the July 10th update ("--enable-local-ltdl" etc.).
That update was basically good, but the "libltdl.tar" aspects seem to have been superfluous. Worse, they were also damaging in various contexts (e.g. when srcdir and builddir are different). The principal files affected are "configure.in" and "bootstrap". But there are small associated administrative changes also in "Makefile.am" and ".cvsignore". [For the record, the detailed discussion is in the thread "libltdl bugfix and improvement" on the "linux-ha-dev" list, August 26th to September 5th.] CVS patchset: 7503 CVS date: 2005/09/08 11:14:49 --HG-- extra : convert_revision : 7d20d51639f709ce1429326c9c4f1bab391e332c 2005-09-08 David Teigland remove bio counters that were kept by the diaper device 2005-09-07 msoffen Added HA_LOGDIR constant and changed heartbeat to use it instead of hard coded /var/log CVS patchset: 7499 CVS date: 2005/09/07 15:38:38 --HG-- extra : convert_revision : da03de43e9a8d797cc25fdfcf3d941d5603c2e04 2005-09-07 horms More verbose reporting of missing SNMP libraries. 
Make sure DHA_D is defined CVS patchset: 7496 CVS date: 2005/09/07 09:23:14 --HG-- extra : convert_revision : 2a77c6fac98645c3cc103dcb62ccc568c8e50089 2005-09-07 David Teigland remove extra tab put back previous insmod printk we dumped the gfs endian conversions and use le everywhere 2005-09-06 David Teigland misc style stuff define GFS2_FSNAME_LEN to use in place of 256 fill in the standard stuff to send plock_get to user space apply same fixes here from gfs comments: enums, better assert replace PRIx64 with llx tidy line breaks remove "get_cookie" ioctl, not used any more tidy recurse_check replace more defines with enums more trivia, replace define with enum 2005-09-05 horms Fix for building SNMP subagent on SuSE CVS patchset: 7486 CVS date: 2005/09/05 10:27:35 --HG-- extra : convert_revision : 98633161819e57b30f1cdb1821776bcc222573eb 2005-09-05 David Teigland more gfs2_assert munging more munging of gfs2_assert get rid of glock_hold/glock_put, use gfs2_glock_hold/put everywhere Don't reset atomic statistics counters to zero when they roll into negative numbers, just cast to unsigned. AFAICT 'gfs2_tool counters' shouldn't need any change. Pass all gfp flags to gfs2_holder_get() instead of having GFP_KERNEL added to what the caller provides. Replace TRUE/FALSE with 1/0. This is a really unfortunate blow to code comprehension but kernel lords are always right. 
slim down the printk's in assertions 2005-09-05 msoffen Added HA_VARLOCKDIR CVS patchset: 7483 CVS date: 2005/09/05 03:03:23 --HG-- extra : convert_revision : 55f7748b68d0527e50a8db0063405e9501136aaa 2005-09-03 David Teigland patch from Mike Christie replacing PRI defines don't need to include smp_lock.h no more oopses_ok no more oopses_ok option remove oopses_ok option since a panic results regardless, no difference remove unused todo option in assert macro 2005-09-02 David Teigland replace gfs's switchable endian conversion functions with plain old le conversions 2005-09-01 msoffen Fixing Alan's fix so that the check of libnet.h works ( It was missing definitions for the big endian vs little endian on FreeBSD CVS patchset: 7464 CVS date: 2005/09/01 19:20:58 --HG-- extra : convert_revision : badaa59c8b4cd370f9205f76433b10d39c02f1c4 2005-09-01 Lon Hohberger Add VMWare ESX server fencing from Zach Lowry Ensure proper linking Fix Joe Orton's comments re: calling ld instead of gcc -shared... 2005-09-01 Patrick Caulfield comment & message tweaks Today's openais tarball has some important fixes in it. Don't untar the ais source on every build 2005-09-01 David Teigland remove old patches 2005-09-01 davidlee The 'use any uuid' list shouldn't include things we cannot yet use. CVS patchset: 7458 CVS date: 2005/09/01 09:59:11 --HG-- extra : convert_revision : f3dd67619a0da72aafe9cf0d330c372fbde2d36a 2005-09-01 David Teigland update for use on -mm kernels update for new plock header update kernel patch generating target 2005-09-01 Patrick Caulfield Temporarily enforce static node IDs until this is sorted out. 2005-09-01 David Teigland revert accidental change use macro to define sysfs attributes Allow "id" for the lockspace/fs to be set through sysfs; used for matching plock responses with right ls/fs (plock requests and responses for all fs's multiplexed through one device) Get plock requests from kernel through misc device. For the moment just pass back "ok" immediately. 
Pass plock requests to user space lock_dlmd through reads on misc device, lock_dlmd passes back results through writes on the same device. get to compile on 2.6.13 2005-08-31 alan Put in an extra test for libnet - make sure headers are installed... CVS patchset: 7456 CVS date: 2005/08/31 20:15:05 --HG-- extra : convert_revision : d000cfed65d03b5e81a95d2367855379f32c1d8f 2005-08-31 Lon Hohberger Fix 167216 -- ip.sh script errors 2005-08-31 alan Fixed a bug in the db2 script which made it not work with newer versions of db2. The bug was assuming that ps output had the db2 command names last on the line. CVS patchset: 7454 CVS date: 2005/08/31 17:42:49 --HG-- extra : convert_revision : f62dc40a414d21492ee122acaf7ee0f133e76e91 2005-08-31 David Teigland do nothing in withdraw (instead of BUG) until the dm hooks are ready use sysfs instead of procfs for list/freeze/withdraw margs/lockdump are not implemented yet Replace the procfs hooks for list/freeze/withdraw with sysfs hooks. The margs/lockdump hooks require something different and are left unimplemented for the moment. fix compiler warning about ignoring return value of inode_setattr 2005-08-30 Patrick Caulfield Configure AIS bit using CCS 2005-08-30 David Teigland These will replace proc.[ch] and do the same thing through sysfs. 2005-08-30 Patrick Caulfield Remove send_queued_events() as it's not used any more. Add IPv6 support, with pre-release AIS code. 2005-08-30 David Teigland follow_link now returns void* 2005-08-29 horms Fix RPM MIB build problem. Bugzilla 843. 
CVS patchset: 7438 CVS date: 2005/08/29 02:05:54 --HG-- extra : convert_revision : d56a3e29715644bebe236b8634031c2bce694f2d 2005-08-29 sunjd bug799: move rsctmp to /var/run/heartbeat CVS patchset: 7437 CVS date: 2005/08/29 02:04:18 --HG-- extra : convert_revision : 01ebfbbf511354fe7eca1a94b58eb74cbaf46230 2005-08-26 msoffen Fixed check for ncurses ( on freebsd they are in subdirectories of include ) CVS patchset: 7426 CVS date: 2005/08/26 17:55:28 --HG-- extra : convert_revision : d8637a19b9940371c0509479662e1cf26a228ea6 2005-08-26 davidlee Move some AC_ checks earlier (as suggested by autoconf docs) to prevent some occasional linking problems CVS patchset: 7424 CVS date: 2005/08/26 14:02:23 --HG-- extra : convert_revision : 1e4e042a3ab6b501b688030f0f4af872db3ed7d3 2005-08-25 Patrick Caulfield Cope with large (>PIPE_BUF) messages coming back from the daemon. Line up heading 2005-08-25 David Teigland clear node struct before every cman_get_node 2005-08-25 Patrick Caulfield Only ask for POLLOUT notification when we have something to send. 2005-08-25 David Teigland memset node struct to 0 before cman_get_node comment out CMAN_CMD_ADD_KEYFILE which isn't in header yet 2005-08-24 davidlee lib/crm/common/iso8601.c portability (with configure.in) CVS patchset: 7405 CVS date: 2005/08/24 16:43:36 --HG-- extra : convert_revision : 7858c8b2782571bd67842387a8bd9b92c29d74f5 2005-08-24 horms Rationalise SNMP configure CVS patchset: 7395 CVS date: 2005/08/24 06:25:58 --HG-- extra : convert_revision : b04411b2fb7346e197aa518a0ec16fdb4a1ed2a0 2005-08-23 Patrick Caulfield Add support for AIS security key. Add support for AIS security key. Tidy IP specifications. Kill cmand if we fail to set its parameters properly. 2005-08-23 sunjd fix the issue of handling netmask parameter incorrectly CVS patchset: 7390 CVS date: 2005/08/23 06:35:41 --HG-- extra : convert_revision : 77a734a9de0697e0316e23aa0b99d5f2d892baf9 2005-08-23 zhenh Done By Sun Xun.
Changes are: *removed hard-coded /var/run *added some cl_log() calls on argument errors *validate-all (Bug #417) *misc cleanup CVS patchset: 7389 CVS date: 2005/08/23 04:31:35 --HG-- extra : convert_revision : fde936c51fb20d9ea9c5ddccdcbe217f5cec56c5 2005-08-22 David Teigland remove PRI/SCN defines that aren't used hold configfs subsys lock while accessing the children list 2005-08-19 sunjd resolve the issue that ping timeout option may be invalid, found and resolved by Horms and Sun Xun CVS patchset: 7379 CVS date: 2005/08/19 14:45:08 --HG-- extra : convert_revision : 72ea795ac7f94b15db1ee40d1326fe1690abec81 remove printw prototype warning, done by Sun Xun CVS patchset: 7377 CVS date: 2005/08/19 09:57:42 --HG-- extra : convert_revision : 5f29fa67ec6375267447512b529b55b84f8067b7 2005-08-19 David Teigland Add printk log levels by replacing printk with one of: fs_info, fs_warn, fs_err. This also lets us do the standard prefix "GFS2: fsid=%s:" in the macro instead of repeating it everywhere. 2005-08-18 andrew Typo spotted by "anymous" on IRC.
CVS patchset: 7372 CVS date: 2005/08/18 13:54:42 --HG-- extra : convert_revision : b483c112b6ee505aba8bd8ebb21581cbba4b498d 2005-08-18 David Teigland select configfs use linux/jhash.h instead of our own member_sysfs.c has been decimated and is no longer related to members, move what's left into lockspace.c 2005-08-18 sunjd * validate-all * removed unused code * misc cleanups CVS patchset: 7370 CVS date: 2005/08/18 04:40:57 --HG-- extra : convert_revision : db3cc76c1da12f1e974ea84fc4a2060952bbc888 2005-08-18 David Teigland update use new schedule_timeout_interruptible remove temp defn of kzalloc xattr_acl.h no longer exists, don't include move to schedule_timeout_interruptible, remove set_current_state try to tidy breaks in long lines in daemon.c 2005-08-17 David Teigland munge whitespace to match upstream so patches are sane dlm_our_nodeid now from config.h functions no longer exist rmdir, not unlink, to remove node from lockspace fix problem with the previous change for multiple addresses support more than 1 address per node, up to DLM_MAX_ADDR_COUNT (3) - when weight isn't set, it should default to 1 - check for error from looking up a node's weight set node weights (now through configfs) 2005-08-17 alan Fixed version number to 2.0.1 CVS patchset: 7346 CVS date: 2005/08/17 04:32:53 --HG-- extra : convert_revision : c13aa659a9b6ace5895059ef123e28d46a5f6c98 2005-08-17 gshi add openais communication module CVS patchset: 7344 CVS date: 2005/08/17 04:18:51 --HG-- extra : convert_revision : 6b1f0f6df95b7568499f46e7a0bd142520ffa623 2005-08-17 David Teigland Move setting of the lockspace id from configfs back to sysfs where it's simpler and cleaner. 2005-08-16 David Teigland Configure node addresses through configfs instead of ioctls. don't compile dlm_tool, configfs changes mean it won't be working for a while Configure lockspace id and members through configfs instead of sysfs. Configure lockspace id and members through configfs instead of sysfs.
(Compiling requires which is in -mm kernels.) depends on IPV6 2005-08-12 sunjd * simple validate-all * remove unused code(per Roberto Nibali) * turned to OCF compliant return/exit code * do not require OCF instance parameters when doing "meta-data", "usage", "methods" * misc cleanup CVS patchset: 7329 CVS date: 2005/08/12 05:56:01 --HG-- extra : convert_revision : 6dd2454ddbb0d814eb283a0d4b6232b1b5d49874 ( Done by Sun Xun ) * validate-all * OCF compliant return/exit status * fixed a logic error which makes arg empty if neither OCF_RESKEY_config or OCF_RESKEY_port is specified, when assigning to arg CVS patchset: 7328 CVS date: 2005/08/12 05:52:57 --HG-- extra : convert_revision : 6cb50574b3d6747b1a521bb43e6da93cac3ae8b8 (Done by Sun Xun ) * validate-all * check for service status at the beginning of xup_start() and xup_stop() * dont require OCF instance parameters when doing "meta-data", "usage" CVS patchset: 7327 CVS date: 2005/08/12 05:50:01 --HG-- extra : convert_revision : bd54ea61fdc51d917b92c042dbd6bc3a1539cd06 (Done by Sun Xun ) * a simple validate-all * OCF compliant return/exit codes * fixed a typo in Clear_bufs(): $2 -> $1 * donot require OCF instance parameters when doing "meta-data", "methods", and "usage" CVS patchset: 7326 CVS date: 2005/08/12 05:44:24 --HG-- extra : convert_revision : 8c00fdd982a696b5af99d924637379ac7b7d0eed 2005-08-11 Patrick Caulfield Unbind connections when they die. Recalculate quorum when we join. Clear struct before calling cman_get_node() Clear node struct before passing it into cman_get_node() Return node addresses. 2005-08-11 David Teigland remove empty kerneldoc headers copyright artwork copyright artwork remove comment I added cool down on the copyright artwork remove functions unused now that diaper is gone gfs2_ip2v(ip, NO_CREATE) -> gfs2_ip2v_lookup(ip) gfs2_ip2v(ip, CREATE) -> gfs2_ip2v(ip) 2005-08-10 Patrick Caulfield small Makefile fixes library commits that go with the last lot. I'm not sure why CVS missed them out. 
Build cman against openAIS's libtotem_pg. This is still pretty unstable stuff so be warned. For the moment it downloads a prepackaged/patched version of openais from my web site. This /will/ change. There's still a lot of work to do on this code but it basically works with a few caveats: 1. Barriers are completely untested and may not work at all. 2. Don't start several nodes up at the same time, they might get the same node ID(!) unless you use static node IDs 3. Some of the info returned by cman_tool is wrong. 4. The exec path for cmand is hard coded (in the Makefile) to ../daemon/cmand so you must currently always run cman_tool from the dev directory unless you change it. 2005-08-10 lars USE_OPENIPMI needs to be disabled in case OpenIPMI is not available in the right version. CVS patchset: 7280 CVS date: 2005/08/10 10:52:30 --HG-- extra : convert_revision : 0d37c55eadb82f4428f5d02fdfd43f35d1d3d425 2005-08-10 David Teigland Remove diaper code, add a FIXME in the place where we need to hook into dm for withdraw. replace 0x7FFFFFFFull with MAX_NON_LFS in size checks 2005-08-10 horms Currently this code only compiles with OpenIPMI 2.X CVS patchset: 7274 CVS date: 2005/08/10 08:57:44 --HG-- extra : convert_revision : f234376cb0199f928f3a04c083e5dacf27ece560 2005-08-10 zhenh add remote python call feature CVS patchset: 7267 CVS date: 2005/08/10 08:05:57 --HG-- extra : convert_revision : 7171b5f17751a78a902c7459bf334ab9eac0167d 2005-08-10 David Teigland c99 initializers fix whitespace damage where lines start with 1-7 spaces followed by a tab replace 256 with defined MAX_LINE fix whitespace damage replace kmalloc/kzalloc with kcalloc where appropriate use kzalloc instead of kmalloc+memset (FIXME: kzalloc added to gfs2.h since it's only in mm at the moment) asm headers after linux convert vma2state from macro to static inline function convert lops.h macros into static inline functions comment to link flags to struct member Remove context dependent path names.
Requested by Al Viro. If they're important we can try to put them back once gfs is accepted. 2005-08-09 David Teigland remove another { } block - remove empty kerneldoc headers - tidy util.h and remove some unused bits - remove Ren & Stimpy quote due to complaint get rid of fixed_div64.h -- the existing do_div() works fine in my tests, and do_mod() isn't needed since do_div() returns the modulus. Use wait_event() in do_lock_wait() instead of managing the waitqueue ourselves. I have some doubt that the log spinlock really needs to be held while testing the atomic counter, but I'm not certain enough to drop it. style munging: get rid of { } blocks within functions more line break and over-80 cleanups replace the hash functions with linux/jhash.h get rid of unused get_time() Get rid of RETRY_MALLOC entirely, although the one place it couldn't be worked around still has a loop. 2005-08-08 David Teigland remove { } creating code block within function due to complaint callers of inode_create() can deal with error, don't need RETRY_MALLOC Remove all memory debugging per lkml comments; a pity, this stuff was nice and simple. Maybe we can try to put it back once gfs is merged. gfs2_disk_hash.h contains gfs2_disk_hash() and crc table that's been removed from gfs2_ondisk.h for gfs2_disk_hash() use the kernel's crc32_le() instead of our own function and crc table 2005-08-05 gshi add compression capability CVS patchset: 7235 CVS date: 2005/08/05 19:40:13 --HG-- extra : convert_revision : 7dd91416cbacd06c89d1f071bd191cf9bb4d94a7 add pils library to the link option for all binaries because clplumbing will depend on pils library CVS patchset: 7233 CVS date: 2005/08/05 19:34:54 --HG-- extra : convert_revision : c169575b4e5c671301ef4677a8cd025ceaa5e271 2005-08-05 David Teigland replace gfs2_sort() with sort() from linux/sort.h gfs2_random() is not used, remove it. 
2005-08-03 horms standardise confdir in scripts CVS patchset: 7212 CVS date: 2005/08/03 14:37:07 --HG-- extra : convert_revision : 2fb9df21beb583f7e1fd86b8bc05d8ab384e9c02 2005-08-03 David Teigland drop unnecessary casts of void pointers spaces to tab 2005-08-03 sunjd ( Done by Sun Xun ) Add the check to mdadm for Raid1.in CVS patchset: 7194 CVS date: 2005/08/03 05:56:10 --HG-- extra : convert_revision : ed6e252d15ff47a1b11726a308edcd139b1deac8 *validate-all *OCF compliant return/exit status *now works with both raidtools and mdadm (some distros don't have raidtools, e.g. FC4) *now works even if more than 10 MD devices *does not require OCF instance parameters when doing "meta-data", "usage" *removed outdated comment CVS patchset: 7193 CVS date: 2005/08/03 05:53:38 --HG-- extra : convert_revision : 4f287f671d5e642dd5a81261e104cb9194e80b2e 2005-08-03 David Teigland Excerpt from Ken Preslan's "ramblings". 2005-08-02 David Teigland __user annotation in gfs2_readlink() __user annotations for proc functions adding static to a bunch of stuff found by sparse, and a __user add static to acl_get() 2005-08-02 Adam Manthei Add support for Dell PowerEdge 1855 to fence_drac (bz 150563). When using the DRAC/MC firmware, an additional parameter is required to specify the module/blade in the PE 1855. 2005-08-01 A. J. Lewis Update fsck in HEAD of CVS with changes made to RHEL4 and STABLE branches - large forward-port of changes * Whitespace changes * Set error variable before use * Remove duplicate allocation code * Don't use the BH_DATA macro when assigning * set error before using it * Display messages before, during, and after clearing journals o Currently displays a '.'
every 10 journals cleared - this messes with the log_info message when -v is used, but I'm not going to worry about it for now * Update Makefile * Fix for bug #160525 - The fsck wasn't handling extended attributes properly - if it found one, it would read an invalid variable instead of the one pointing to the eattr block. * Create log_at_* macros o log_at_* only prints the message if the default level equals the priority of the print request * use log_at_notice for '.' printing in journal "replay" * Turn some non-static functions static * Add newline to message * Add a function to write out a new superblock o writes out the superblock contained in sbp->sb * Prevent new mounters by changing the lock protocol o Change the lock protocol from lock_* -> fsck_* on startup o Change it from fsck_* -> lock_* on teardown o Currently requires gfs_tool to fix if the fsck fails partway through * Fix minor warning in super.c * Fix for bz #160835 - fsck erroneously reporting bitmap corruption o converting free metadata -> free data should not generate errors o this adjusts the way the fsck checks the bitmaps so it does not, it just notified the user that the block is getting converted from free metadata to free data. * Fix up build_metalist to work properly nbh wasn't getting zeroed properly in the loop that was using it, resulting in bad values getting passed in if check_metalist didn't set it. 
* fix up check_eattr* calls and add debug info check_eattr_indir and check_eattr_leaf weren't setting bh properly added some log_info/log_debug calls to help track down what's going on 2005-08-01 zhenh check the status of ip just after using ifconfig add it CVS patchset: 7177 CVS date: 2005/08/01 10:52:47 --HG-- extra : convert_revision : a6020de1df2d0afe01890340e7c0f75a15a04f77 2005-08-01 sunjd ( Done by Sun Xun ) * turn some return/exit values to OCF compliant * validate-all, some code copied and modified from IPaddr RA * do not distinguish "status" from "monitor", the original implementation is self-contradicted * do not require OCF_RESKEY_ipaddress to be set when doing "meta-data", "usage" CVS patchset: 7176 CVS date: 2005/08/01 10:17:03 --HG-- extra : convert_revision : 5764ee39aab2a6cfbd9050719cf867d35f2f6668 correct a typo CVS patchset: 7175 CVS date: 2005/08/01 10:12:20 --HG-- extra : convert_revision : 36b1e26306cfe309889cfbf09395e7f9c4f303a5 2005-08-01 David Teigland Go back to schedule_timeout() in the daemons. We want to wake up more often than the timeout, which schedule_timeout() typically does. msleep won't return until the timeout at a minimum which makes unmount take a long time. inline instead of __inline__ more tidying, removing old or unnecessary comments 2005-07-29 alan Continuing trying to figure out why dobeam doesn't find the glib2 libraries. CVS patchset: 7160 CVS date: 2005/07/29 18:41:38 --HG-- extra : convert_revision : bf9980705ddc416b4cd598bbaab169c13b9dcff6 Trying to figure out why we can't seem to find the glib libraries - but only when I'm running the beam tests... 
CVS patchset: 7159 CVS date: 2005/07/29 18:29:56 --HG-- extra : convert_revision : 7cfdafa0bdc0304964cda873831c109216685e11 Updating version number to 2.0.0 CVS patchset: 7158 CVS date: 2005/07/29 17:11:32 --HG-- extra : convert_revision : 3fe6b0f95c2144c1b40b216a6f7c1c92e268fdb8 2005-07-29 Lon Hohberger fix 164627 fix 162501 2005-07-29 sunjd bug668: license update CVS patchset: 7147 CVS date: 2005/07/29 07:36:15 --HG-- extra : convert_revision : d764d0b68e32740647f86d984e369cf6c438273d bug668: license update CVS patchset: 7139 CVS date: 2005/07/29 06:12:29 --HG-- extra : convert_revision : 951fe0823707bd9bda882cb738e79fce7bc1096b 2005-07-29 David Teigland Add a comment with Ken's explanation of the diaper device. In conversation, simply refer to "GFS", not "GFS2". 2005-07-29 Jonathan Brassow - s/uint32_t/sector_t for get_region_size return - Fix cmirror bugs - Only allow one machine to perform resync work There must be a race in there somewhere when allowing multiple simultaneous recoveries... This side-steps that issue. Besides, one machine moving the heads is probably better than all. - when a resync is done, we must mark it in the clean list as well as the sync list. Otherwise, the log will always tell us there is recovery to be done - even if there isn't. - Only suspend the server on cluster events - not dm suspends. If we flush the clients before returning from the suspend, the server will do nothing anyway. - fix typo - clean up logging - take the code that connects to cman out of the init function and do it on first mirror activation. This way, you don't have to have cman running in order to load the module... Makes mirror activation a bit slower, but speeds up module loading. 2005-07-28 Adam Manthei fix for bz 161352 Adds support for latest ilo firmware version (1.75). Changes were also added to make sure that the power status of the machine is being properly checked after power change commands have been issued.
Before, it was assumed to work through the messages received in the xml response. As a result, systems with ACPI enabled would soft power off, leaving systems running instead of properly fencing them. A new option, ribcl, was added to the agent to force the protocol to use when trying to send the power off command. Setting this value to "2.21" might be required to work around some versions of ilo on machines with ACPI enabled. If left blank, the protocol version is autodetected. 2005-07-28 Lon Hohberger Fix 159767 Fix 163651 2005-07-28 David Teigland remove trailing whitespace replace spaces with tabs add file for kernel patch include gfs2.txt in kernel patch depend stuff The max num_glockd was lowered from 32 to 16 when we switched to using the kthread routines. To avoid a compile warning, replace get_v2ip(aspace) = NULL with set_v2ip(aspace, NULL). Not sure how Ken overlooked this one when replacing all the others... 2005-07-27 Adam Manthei o make init.d/fenced exit with WARNING instead of FAILED when using gulm cluster management (bz 159685) o more initscript clean ups (bz 155478) 2005-07-27 Chris Feist file dm-log.h was initially added on branch RHEL4. file dm-cmirror-xfr.h was initially added on branch RHEL4. file dm-cmirror-xfr.c was initially added on branch RHEL4. file dm-cmirror-server.h was initially added on branch RHEL4. file dm-cmirror-server.c was initially added on branch RHEL4. file dm-cmirror-common.h was initially added on branch RHEL4. file dm-cmirror-cman.h was initially added on branch RHEL4. file dm-cmirror-cman.c was initially added on branch RHEL4. file dm-cmirror-client.c was initially added on branch RHEL4. file Makefile was initially added on branch RHEL4. file uninstall.pl was initially added on branch RHEL4. file release.mk.input was initially added on branch RHEL4. file defines.mk.input was initially added on branch RHEL4. file configure was initially added on branch RHEL4. file TODO was initially added on branch RHEL4.
file README was initially added on branch RHEL4. 2005-07-27 panjiam changed license statements CVS patchset: 7114 CVS date: 2005/07/27 10:16:22 --HG-- extra : convert_revision : e833a87016b165d32711b8017f95d1f1770ed794 2005-07-27 David Teigland include "locking/harness/lm_interface.h" generate new kernel patches adds gfs2 to kernel build used to generate kernel patch where we need quotes around lm_interface include get lm_interface.h from . get lm_interface.h from ../harness 2005-07-27 sunjd ( Done by Sun Xun ) *validate-all *don't specify path to 'mail', since the path may vary between systems *don't require OCF_RESKEY_email when doing 'meta-data', 'usage' CVS patchset: 7111 CVS date: 2005/07/27 08:26:57 --HG-- extra : convert_revision : 03e8419bf2164be668a1bc561428d0edbd3af164 ( Done by Sun Xun ) * validate-all (bug 417) * incorporated Horm's patch dealing with different LVM versions * do not require OCF_RESKEY_volgrpname when doing "meta-data", "methods", "usage" CVS patchset: 7109 CVS date: 2005/07/27 08:10:40 --HG-- extra : convert_revision : a5f31291e9ebd34983e34a7c3862b44479a30f15 2005-07-27 David Teigland typo bug "&atomic_read()", remove & add 2005 to copyright reorder/indent for readability remove unused options remove define parens fix typo bug from list macro conversion 2005-07-26 David Teigland remove/change comments that are out of date or unnecessary 2005-07-26 Daniel Phillips add interface to specify initial socket as ascii fdnumber to avoid exporting sys_socket and sys_connect 2005-07-26 David Teigland replace __inline__ with inline use msleep() instead of schedule_timeout() Only allow one of the lock_dlm threads to do blocking callbacks. It appears that gfs2 will wait for a completion callback within a blocking callback, so one thread must always be available to do completions. 
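The lock_dlm note above (only one thread may run blocking callbacks, so another is always free for completions) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not lock_dlm code: the names `run_callbacks`, `blocking_worker` and `completion_worker` are hypothetical, and plain Python threads and queues stand in for the kernel threads.

```python
import queue
import threading


def run_callbacks(num_blocking, timeout=2.0):
    """Toy model: each blocking callback waits for its own completion.

    One thread is reserved for blocking callbacks and one for completions,
    so a completion can always be delivered. Returns True if every blocking
    callback saw its completion before `timeout` seconds elapsed.
    """
    blocking_q = queue.Queue()     # served by exactly one thread
    completion_q = queue.Queue()   # served by the other thread
    done = [threading.Event() for _ in range(num_blocking)]
    finished = []

    def blocking_worker():
        while True:
            item = blocking_q.get()
            if item is None:       # sentinel: no more work
                return
            # The blocking callback requests a completion, then waits for it.
            completion_q.put(item)
            if done[item].wait(timeout):
                finished.append(item)

    def completion_worker():
        while True:
            item = completion_q.get()
            if item is None:       # sentinel: no more work
                return
            done[item].set()       # deliver the completion

    workers = [threading.Thread(target=blocking_worker),
               threading.Thread(target=completion_worker)]
    for w in workers:
        w.start()
    for i in range(num_blocking):
        blocking_q.put(i)
    blocking_q.put(None)
    workers[0].join()
    completion_q.put(None)
    workers[1].join()
    return len(finished) == num_blocking
```

If both threads were allowed to pick up blocking callbacks, each could end up waiting for a completion that no thread is left to deliver, which is the stall the commit avoids.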
2005-07-25 David Teigland use kthread functions don't build unused debug header use list_for_each_entry munging to fix lines over 80 and other odd line breaks (leave a bunch over 80 that can't be trivially fixed) depends on SYSFS Signed-off-by: Adrian Bunk 2005-07-22 sunjd (Done by Sun Xun ) *a simple validate-all implementation *removed obsolete word from usage section *exit immediately on errors in parseinst() and other places *changed return/exit values to $OCF_* *do not require OCF instance parameters when doing "meta-data", "methods", "usage" CVS patchset: 7093 CVS date: 2005/07/22 07:45:37 --HG-- extra : convert_revision : 9db462f1a3f547ad0787e1c68b2f434e710af02a Let CheckInterval() accept only positive integers, done by Sun Xun CVS patchset: 7092 CVS date: 2005/07/22 07:40:52 --HG-- extra : convert_revision : 613ffbb964d70e4fa1864d610496e5fbdb140702 2005-07-22 David Teigland Remove ENTER/RETURN macros used for profiling and tracing debugging. 2005-07-21 David Teigland remove more define parens remove GFS2_RELEASE_NAME remove parens from defined values, fix lines over 80 function type and name on same line is preferred style 2005-07-21 sunjd resolve the EAGAIN issue CVS patchset: 7085 CVS date: 2005/07/21 07:50:08 --HG-- extra : convert_revision : ed46e4502af87859736b15167d23ffe1e2ed58dd avoid EAGAIN in lrmd: if the machine runs fast enough, the background process may not even have finished redirecting stdout when the RA exits CVS patchset: 7084 CVS date: 2005/07/21 07:22:11 --HG-- extra : convert_revision : de643984a24491547a3f75a5c2278ba7674e37ca 2005-07-21 David Teigland cleanup and tidying - remove parens around defined values - use tabs where appropriate - prune excessive commenting (some of which may not apply to gfs2) Carry out Ken's instructions from gfs2_ondisk.h to change the ondisk format: make mh_type and mh_format into uint16_t make de_name_len and de_type into uint8_t 2005-07-20 Lon Hohberger Fix 159637 2005-07-20 David Teigland fix mistake from
converting to list_for_each 2005-07-20 Jonathan Brassow - do not call completion on every suspend, it increments a counter which causes wait_for_completion()'s to proceed w/o waiting. 2005-07-19 Jonathan Brassow - don't call complete() on failure_completion if we are in "core" mode. 2005-07-19 sunjd Bug 417; Done by Sun Xun *validate-all *add Author, Copyright, License *removed unused var BlockOrUnblock *do not hardcode 'iptables' path *do not require parameters when doing "meta-data" or "usage" CVS patchset: 7073 CVS date: 2005/07/19 09:50:49 --HG-- extra : convert_revision : d7b78da364cea0450a04d64007489753da05e936 To avoid issues when called by lrmd, redirect stdout->stderr. CVS patchset: 7071 CVS date: 2005/07/19 07:39:16 --HG-- extra : convert_revision : 659985ef491a41adecf5bf83f3a857866944687d 2005-07-19 David Teigland tidying and filling out comments 2005-07-19 sunjd avoid EAGAIN in reading pipe in lrmd which implies a process may be terminated by SIGPIPE. Redirect to stderr so users can read the output when using it manually CVS patchset: 7070 CVS date: 2005/07/19 06:35:29 --HG-- extra : convert_revision : a94aa62dd90a3999b749acd3b3d95f2ac6e868a4 2005-07-19 David Teigland function type and name on same line, use list_for_each_entry kernel people like function type and name on same line Significant cleanup, lots of style stuff 2005-07-18 horms Fix for CAN-2005-2231 temporary file vulnerabilities CVS patchset: 7060 CVS date: 2005/07/18 20:19:45 --HG-- extra : convert_revision : 4f79bec1f3f8c9f06a47bd0d1751647dae34086d 2005-07-18 gshi typo found by Roberto Nibali and David Lee CVS patchset: 7057 CVS date: 2005/07/18 16:36:41 --HG-- extra : convert_revision : 723dced2d646bafc2776411bef133c1d2284d60e 2005-07-18 sunjd Bug 417: add validate-all action; cleanup meta-data information. Done by Sun Xun CVS patchset: 7049 CVS date: 2005/07/18 15:13:28 --HG-- extra : convert_revision : 00662fe216b29ae49661caef8e619afc72e9a215 Bug 417 and other polish.
This is done by Sun Xun - validate-all - exit immediately in parseinst() on failure - change "scsi-add-single-device" to "scsi add-single-device". See http://lxr.linux.no/source/drivers/scsi/scsi_proc.c#L255 for why. - add explicit return value of scsi_status() - don't require OCF_RESKEY_scsi when doing 'methods', 'meta-data', 'usage' CVS patchset: 7048 CVS date: 2005/07/18 15:07:16 --HG-- extra : convert_revision : fe91a46a3bd60d2b59eab8f361c4ae44692d9a2c 2005-07-15 alan Fixed some bugs regarding apache version 2. Need to process config files recursively. CVS patchset: 7024 CVS date: 2005/07/15 16:39:53 --HG-- extra : convert_revision : 53fdc47ff276b7acef8159f5805ce4f9e0ce64d9 Fixed a bug where /usr/lib/ocf was set from wrong variable. CVS patchset: 7023 CVS date: 2005/07/15 16:36:14 --HG-- extra : convert_revision : d430d69033abd4a30ddba43f7349e36f25146029 Changed the default apache config name for the OCF resource agent... CVS patchset: 7019 CVS date: 2005/07/15 15:06:10 --HG-- extra : convert_revision : d755b6a510c61001fb3dee2aa25af21584c386ec Changed the default pid file, and also allowed for the existence of apache2... 
CVS patchset: 7018 CVS date: 2005/07/15 14:46:49 --HG-- extra : convert_revision : 45fe588362db8a8b74332c7b80c64e51072a5de3 2005-07-15 sunjd license update; use ocf_log instead CVS patchset: 7017 CVS date: 2005/07/15 09:51:56 --HG-- extra : convert_revision : d74efb7ac8443a18ad11e10333e5053a7271d7ec correct misuse of ocf_log; return value, license update CVS patchset: 7016 CVS date: 2005/07/15 09:51:15 --HG-- extra : convert_revision : cefa6f324fb94f76020fbf6e70cd74e72f5ce42f license, support update; replace 'echo' with ocf_log CVS patchset: 7015 CVS date: 2005/07/15 09:44:32 --HG-- extra : convert_revision : b206388ea64d2fad4539ef5e88fb49db018062d9 license, CR update; replace 'echo' with ocf_log; use OCF_* instead CVS patchset: 7014 CVS date: 2005/07/15 09:43:25 --HG-- extra : convert_revision : 8ac022f340a83de7e8ecb6a89d8999f591dd4d75 remove the redundant echo after using ocf_log CVS patchset: 7013 CVS date: 2005/07/15 09:28:21 --HG-- extra : convert_revision : 1e9b486272f170f020314d5a48ea11482ba1011d add log output; license, return value polish CVS patchset: 7012 CVS date: 2005/07/15 09:19:46 --HG-- extra : convert_revision : 10f4db7490e04eaf4e3186ac399e267c8544f5d4 correct the misuse of ocf_log; add log output; correct the misuse of OCF_* CVS patchset: 7011 CVS date: 2005/07/15 09:04:03 --HG-- extra : convert_revision : aa0a1c5ac16d5adecd402879c54aef193441735a add some ocf_log; use OCF_* for return value instead CVS patchset: 7010 CVS date: 2005/07/15 08:54:08 --HG-- extra : convert_revision : e215b98a8e25a1917c7b2688c8798d39bc807304 bug647: correct the misuse of ocf_log CVS patchset: 7009 CVS date: 2005/07/15 07:58:59 --HG-- extra : convert_revision : cebb45e19ed9945e293f07c5d91ae636b2a91b0b correct the misuse of ocf_log CVS patchset: 7008 CVS date: 2005/07/15 07:45:58 --HG-- extra : convert_revision : 6266cf926f11c2d0fe441de485de6bb716300371 2005-07-15 David Teigland wait to add second mounter until gfs's initial recovery is done on first mounter (gfs does
others_may_mount) 2005-07-15 sunjd log some output to stdout CVS patchset: 7007 CVS date: 2005/07/15 07:38:53 --HG-- extra : convert_revision : f02d90ac9fe460486a9c81c7be84336bdfe9b45d update on license, CR and ocf_log using CVS patchset: 7006 CVS date: 2005/07/15 07:28:08 --HG-- extra : convert_revision : b178be1f178df78553001d72297b2ade8a53475e polish on ocf_log using and license statement CVS patchset: 7005 CVS date: 2005/07/15 07:19:41 --HG-- extra : convert_revision : 3b03d000001275022f4c409797ca8e07b6862ffc bug 417: add the validate-all action; other minor improvements. Done by Sun Xun CVS patchset: 7004 CVS date: 2005/07/15 07:12:18 --HG-- extra : convert_revision : 61ca83c9abf6af958e7589424380e7aab8afc906 2005-07-15 David Teigland - simplify a bunch of old junk - wait for all recoveries to complete before processing any joins/leaves global "joining" var needs to be per lockspace add 2005-07-15 sunjd bug 747: correct the misuse of ocf_log; other minor polish CVS patchset: 7003 CVS date: 2005/07/15 03:24:09 --HG-- extra : convert_revision : 452e112cc9063a8189810fe2a09fff3de3d8d766 2005-07-14 zhenh 1. remove the bad code in is_addr6_available(); 2. add a parameter check CVS patchset: 6998 CVS date: 2005/07/14 14:20:00 --HG-- extra : convert_revision : d780a77d45eaaab51f0faa6cba1acf05b0906e9f 2005-07-14 Patrick Caulfield interesting typo. 2005-07-14 David Teigland use a constant message size between libgroup and groupd -n used with gnbd commands 2005-07-13 Jonathan Brassow - add cluster "core" log support - cluster mirror is working again, but requires patches to kernel 2005-07-12 Lon Hohberger fix 157327 2005-07-12 Patrick Caulfield file saClm.h was initially added on branch STABLE. file saAis.h was initially added on branch STABLE. file clm.c was initially added on branch STABLE. 2005-07-12 alan Put in some changes to set bytes we're sending out to zero to avoid sending out garbage to the net. 
CVS patchset: 6987 CVS date: 2005/07/12 15:34:18 --HG-- extra : convert_revision : b009ee219784714e53fedac523368b99c5df397c 2005-07-12 Lon Hohberger Apply patch from Eric Kerin to fix #162824, fix #162936 2005-07-12 alan IPv6addr.c did not compile correctly. It used kernel headers directly. This is forbidden. They didn't make much if any use of the libnet headers - which I don't understand. They may also still be not using some of the libnet functions they should be. The WHOLE POINT of libnet is to avoid yucky kernel dependencies, etc. CVS patchset: 6984 CVS date: 2005/07/12 14:46:22 --HG-- extra : convert_revision : 6a559301ef45309911adde77274af21a1baaef89 2005-07-12 Jonathan Brassow - Bring back up to sync with latest mirror changes - New function added (*is_remote_recovering)() - restructure log dev failure detection - Need to add pre/post suspend support 2005-07-11 Lon Hohberger Fix type causing verify-all to not work properly 2005-07-11 davidlee Correct minor syntax error in 1.409 (my error in email to Andrew) CVS patchset: 6972 CVS date: 2005/07/11 14:17:27 --HG-- extra : convert_revision : 5bbd5fc65870ad92407719b81545777e2568a2d5 2005-07-11 andrew Change the name of the local-ltdl to bundled-ltdl to be clearer Changes suggested by David Lee * Do not compress the archive: -z is a GNU-specific option for tar * Use $(TAR) instead of tar CVS patchset: 6971 CVS date: 2005/07/11 12:34:57 --HG-- extra : convert_revision : fc8b21cf38e6d7162cddd030e0b8da8ad0e496a2 2005-07-11 sunjd Add support for the validate-all action; other minor polish. Done by Sun Xun CVS patchset: 6966 CVS date: 2005/07/11 09:16:35 --HG-- extra : convert_revision : a4ae8ecd3600c8ee280913c16cd82e333be1f8fb Add support for the validate-all action; other minor polish.
Done by Sun Xun CVS patchset: 6965 CVS date: 2005/07/11 08:57:16 --HG-- extra : convert_revision : 18dfafdea4ec32b520c6d2371abf1fca3de844d5 2005-07-10 andrew Drive Darwin options from ./ConfigureMe CVS patchset: 6959 CVS date: 2005/07/10 10:58:40 --HG-- extra : convert_revision : 249aebc29f6a6dbe21e8143f61abe97feba8e0c3 Add a --enable-local-ltdl option to configure. * If --enable-local-ltdl=yes, it will always conform to the legacy behavior (build and use a standalone version of LTDL) * If LIBLTDL is NOT installed, it will always build and use a standalone version of LTDL * Otherwise, if LIBLTDL is installed it will use that. Packagers should probably use --enable-local-ltdl to retain legacy behaviour, though I personally feel it is very bad for heartbeat to be overwriting libraries that are potentially newer than what we supply and belong to another package (libtool). Tested on Darwin, SLES and Debian. CVS patchset: 6957 CVS date: 2005/07/10 10:54:27 --HG-- extra : convert_revision : 332fde2002b6e0e226e0e82b81c41ed02bb2196e 2005-07-08 Lon Hohberger Fix 162805 2005-07-08 alan Removed SWIG CVS patchset: 6944 CVS date: 2005/07/08 14:53:58 --HG-- extra : convert_revision : a8e05082fd096ad3c342dc51a6c5c676dfbf2698 2005-07-08 davidlee Allow OCF directory to be changed if necessary. CVS patchset: 6939 CVS date: 2005/07/08 12:10:37 --HG-- extra : convert_revision : 77de6b93bae3264f69caae456d07cb473c753ad0 2005-07-07 andrew s/interval/update/ so it doesn't conflict with the action option with the same name add the ability to run as a different user - move into C at some point CVS patchset: 6930 CVS date: 2005/07/07 19:15:14 --HG-- extra : convert_revision : e473df1299bfb9845c217c14b4e76268a43413dd New file CVS patchset: 6911 CVS date: 2005/07/07 06:39:03 --HG-- extra : convert_revision : eed152c00903f00432f0a01cd7ca26089066d37d Fix a dodgy first cut.
CVS patchset: 6910 CVS date: 2005/07/07 06:38:44 --HG-- extra : convert_revision : 58f2bdafa2d64a8b2eb0e169f074f8e60669f2f3 2005-07-06 Jonathan Brassow - last commit before major changes... don't want to lose this. 2005-07-06 Patrick Caulfield Fix device refcounting 2005-07-06 alan Changed version numbers to make things slightly more sane. CVS patchset: 6900 CVS date: 2005/07/06 14:39:11 --HG-- extra : convert_revision : a92a0f355eced6786230ec0198ddec5a6be3b7fe 2005-07-06 davidlee [Oops. Previous commit had used "-m" rather than "-F" for the text.] CPPFLAGS/LIBS: Setting these as OS-dependencies feels against the spirit of autoconf. Rather, they are generally regarded as site-dependencies; sites needing them will already tend to have local conventions for building their autotools applications. (I would recommend the heartbeat custodians of other OSes to consider similar removal, or at least migration to "ConfigureMe".) INIT_EXT: Should be empty for Solaris. CVS patchset: 6893 CVS date: 2005/07/06 11:50:30 --HG-- extra : convert_revision : 0afc9d15bf74fd36359232269bf06d3d8f8858aa /tmp/sol.txt CVS patchset: 6892 CVS date: 2005/07/06 11:33:11 --HG-- extra : convert_revision : 4f28f501c4ddaeac655fb40c29ac19eba027b1c7 2005-07-06 andrew Add an RA that controls crm_mon for producing html status reports CVS patchset: 6890 CVS date: 2005/07/06 09:43:03 --HG-- extra : convert_revision : 63d255060652ca866bb5153bf6f8fa00cd743823 2005-07-06 Adam Manthei add fence_drac to default build 2005-07-05 Michael Conrad Tadpol Tilstra added a warning. 2005-07-05 davidlee improve portability of [n]curses aspects CVS patchset: 6870 CVS date: 2005/07/05 16:31:45 --HG-- extra : convert_revision : 22d1bdda0178cc1c89f644acef875db8bc5e2354 2005-07-04 andrew Safer checks CVS patchset: 6866 CVS date: 2005/07/04 06:22:44 --HG-- extra : convert_revision : 0804d28392a5cd0d14640af529f0959cd6a54226 2005-06-30 Michael Conrad Tadpol Tilstra same sort of compiler fixes that I did in userspace.
2005-06-30 David Teigland Last change was not correct, it's a second mount we need to delay completing until first_done. When we're the first mounter, wait for "first_done" (set by others_may_mount) before completing the start. add "first_done" sysfs file for dlm_controld to be notified of gfs's others_may_mount() 2005-06-29 andrew Check for ncurses CVS patchset: 6815 CVS date: 2005/06/29 11:12:02 --HG-- extra : convert_revision : c07317779f68b13c8e9bf2985da9aa25acac2775 2005-06-28 Adam Manthei o man page for fence_drac o updates to usage to include ccs options Support for Dell Remote Access Card III/XT This replaces the racadm call with direct access to the telnet interface of the DRAC. This will require the user to first enable the telnet interface. The update has been tested on the following hardware. A more comprehensive list needs to be created. The following agent has been tested on:
Model               DRAC Version   Firmware
------------------- -------------- ----------------------
PowerEdge 750       DRAC III/XT    3.20 (Build 10.25)
2005-06-27 Michael Conrad Tadpol Tilstra gcc4 hates me. This fixes an occasional bug where services cannot login to gulm. 2005-06-27 David Teigland ack stop callbacks Add new phase to wait for application acks following a stop callback. 2005-06-24 msoffen Fixed how tarfile for port is created (it is now heartbeat/FILENAME in the tarfile). CVS patchset: 6779 CVS date: 2005/06/24 17:30:50 --HG-- extra : convert_revision : d800ad13be99edf5432d71de2d2e25dc4587e2b0 2005-06-23 David Teigland fix Makefiles for install When a lockspace on a remote node is not found for a recovery status request, we need to treat this as if it did exist and has a 0 recovery status. Need to release the list of root rsb's when recovery is aborted early. incorrect logic in telling kernel when join/leave was complete file INSTALL was initially added on branch STABLE.
2005-06-22 David Teigland If recover_locks() on an rsb doesn't find any locks to recover, we need to clear the NEW_MASTER flag since it won't be cleared by dlm_recovered_lock(). Also add an assert that NEW_MASTER is set in dlm_recovered_lock(). need to copy the rsb's hash into the remove message Per-lockspace option for dlm to run without using a resource directory. What would be the directory node for a resource, is statically assigned to be the master node instead. - no directory lookups are done which speeds up most new requests - the first node to lock a resource is now unlikely to be the master for it, slowing down other cases - combined with directory weights, the dlm can be configured to run as a "lock server" where the lock master has a weight of 1 and all others have a weight of 0 file VERSION was initially added on branch STABLE. file COPYING was initially added on branch STABLE. 2005-06-21 Benjamin Marzinski Adding Fabio Massimo Di Nitto's patch to keep up with changes in the kernel inode structure. 2005-06-21 David Teigland get to compile on 2.6.12 dlm builds on 2.6.12 2005-06-19 alan IPaddr misused ocf_log - now it should be happy :-) CVS patchset: 6733 CVS date: 2005/06/19 05:05:40 --HG-- extra : convert_revision : 9488e6253a8d726e8385b297ce06e437d976dd60 Fixed a wee bug in ocf_log. (It didn't log the message, just the priority) CVS patchset: 6732 CVS date: 2005/06/19 05:02:12 --HG-- extra : convert_revision : 7dc8d5903d47c0a19d5a0391bb83785ba8d74df9 Added a couple of cases and an error message to the ocf_log function. CVS patchset: 6730 CVS date: 2005/06/19 02:37:50 --HG-- extra : convert_revision : f6289a3649e69fcda651338f1275c3e580810f76 2005-06-17 Lon Hohberger Don't use _syscall macro, patch from Adam Conrad 2005-06-17 zhenh add haresources2cib.py CVS patchset: 6716 CVS date: 2005/06/17 13:46:44 --HG-- extra : convert_revision : 9eb7002b9eebb9fde95a2562cf5d8d87fa0f4ead 2005-06-16 davidlee In "--enable-snmp-subagent=..." 
the "try" value, added around May 27 2005, was not fully working. It should be OK now. Macro "LIB_SNMP" can be envisaged as a sort of function: input: "--enable-snmp-subagent={yes|try}" result: "--enable-snmp-subagent={yes|no}" This "function" continues to have "side effects" of setting other variables etc. (One of those tidied up and removed was its setting of "SNMP_SUBAGENT_ENABLED". This is only needed by "configure.in" (function caller); it is trivially derivable from the function result.) The internals of that LIB_SNMP macro, in "config/snmp_subagent.m4", needed some tidying to help achieve all this. The main block of library-checking code, and a few separate loose ends that logically belonged with it, have been concentrated into a new (internal use only) macro "LIB_SNMP_LIBS". I would have liked to have made this a shell-function, but this update was already getting large enough. And talking of shell-functions... I see that the LIB_RPM macro is invoked (i.e. fully expanded) twice. Each expansion seems to add close to 1,000 (one thousand) lines of shell script into the resulting "configure". This is surely not ideal. Perhaps a future vistor here might wish to consider making LIB_RPM a shell-function. CVS patchset: 6700 CVS date: 2005/06/16 17:15:44 --HG-- extra : convert_revision : d43bd28f6b90b9692c3620c08264a500f722bd1b 2005-06-16 Patrick Caulfield Add some (hopefully helpful) comments 2005-06-16 davidlee migrate m4 inclusion of 'config/snmp_subagent.m4' from 'configure.in' to new 'acinclude.m4' CVS patchset: 6699 CVS date: 2005/06/16 14:21:02 --HG-- extra : convert_revision : 0d8fa59c2b8393d05cee5048ef4d083550f710d0 2005-06-16 David Teigland Resolve potential recovery problems in dealing with an rsb prior to the master nodeid being confirmed. This simplifies the handling of the master confirmation in general, too. 
2005-06-15 Patrick Caulfield use umask so that permissions on /etc/cluster/cluster.conf are -rw-r----- 2005-06-15 Jonathan Brassow - use umask so that permissions on /etc/cluster/cluster.conf are -rw-r----- Thanks to Fabio Massimo Di Nitto for spotting this. 2005-06-15 davidlee common code for syslog facility name/value conversion CVS patchset: 6685 CVS date: 2005/06/15 14:08:37 --HG-- extra : convert_revision : ff71352a9d8331c48e2a0e073b0ec3bbb554c11f 2005-06-15 David Teigland New way of controlling the dlm, no longer mirrors groupd callbacks. different sysfs hooks for controlling the dlm Big rework of lockspace control/management. Simplifies things significantly and removes a lot of code. 2005-06-15 Jonathan Brassow - fix for bug 157094 A mysterious error was being generated when trying to do a broadcast (sendto): ccsd[1704]: Unable to perform sendto: Cannot assign requested address On certain clusters (seems to be when ccs tries using IPv6), this error could show up 9 out of 10 times. When the error was received, the broadcast attempt would fail. This caused the attempt to grab any possibly updated cluster.conf files to abort. Waiting a moment, closing the socket, reopening the socket, and retrying the broadcast seems to solve the issue. (It has worked 100+ times so far.) I'm not entirely certain what is causing the initial try to fail - perhaps the underlying subsystem is not quite ready... In any case, I have never seen a second attempt fail. 2005-06-14 Lon Hohberger Fix bug in ip.sh which would match 10.1.1.1 as being the same as 10.1.1.111 2005-06-14 davidlee Allow use of inbuilt uuid functionality CVS patchset: 6659 CVS date: 2005/06/14 09:53:21 --HG-- extra : convert_revision : d49299b85d93f4114aef704a6bcce7c7d94be57c 2005-06-14 Daniel Phillips Initial add, ddraid 2005-06-13 msoffen Changes for building FreeBSD portfile.
CVS patchset: 6656 CVS date: 2005/06/13 20:16:59 --HG-- extra : convert_revision : 1ad8b6df1feab1ef5fc5a8f1246388ff3039fa10 2005-06-10 Patrick Caulfield Put saved messages on the right lists 2005-06-10 Lon Hohberger Add Patrick's initial fence_xen to fence Xen virtual machines. 2005-06-09 davidlee enable build when builddir different from srcdir CVS patchset: 6607 CVS date: 2005/06/09 14:40:10 --HG-- extra : convert_revision : 49abe17008f9ae6d753322e777fcfed9d34e427f 2005-06-09 David Teigland Replace test_bit(), set_bit(), clear_bit() of rsb flags with rsb_flag(), rsb_set_flag(), rsb_clear_flag() which use the less expensive non-atomic bit operations. cf. include/net/sock.h 2005-06-09 Patrick Caulfield cmand depends on commands.o not commands.c 2005-06-08 David Teigland Use ccs to get a node's optional weight value. default weight is 1 not 0 Use node weights in directory node mapping. A node is responsible for a portion of the directory equal to its weight divided by the sum of weights from all nodes. By default a node has a weight of 1. All nodes having weight 1 is the standard old behavior. If all nodes have weight 0, all revert to weight 1. 2005-06-08 Patrick Caulfield Fix crash with barriers, caused by overtidying. ccs.h is not really a dependency 2005-06-08 sunjd make GCC4 happy. Explicit transfer CVS patchset: 6600 CVS date: 2005/06/08 08:27:25 --HG-- extra : convert_revision : 49fa4d5eafd84b0fab41ece0b512eb475a79f39d 2005-06-07 Lon Hohberger Add patch from Frederik Schueler to remove implicit dependency on rdisc 2005-06-06 Michael Conrad Tadpol Tilstra Added some diagnostic messages for when clients/slaves cannot connect to a LT. They're under the Network2 verbosity setting. 2005-06-06 Patrick Caulfield Move a bit more stuff around. Fix Makefile dependencies 2005-06-03 Patrick Caulfield A bit more tidying. Fix some comments 2005-06-03 David Teigland This patch makes needlessly global code static.
Signed-off-by: Adrian Bunk look through correct list (members_gone, not members) for recovering node when gfs does recovery_done adjust some log_error() messages and open syslog when daemonized Add two comments in set_master() explaining how things work. When an outstanding lookup is re-processed after recovery, the MASTER_WAIT flag needs to be cleared first or the lkb will be made to wait on the rsb's lookup list, i.e. waiting for itself. Also add a FIXME comment describing a related recovery scenario we don't yet handle correctly. All lookups outstanding when recovery happens need to be resent after recovery. The RESEND flag was not being set on these lkb's, though, so the lookups were never being resent. 2005-06-03 sunjd correct some typos CVS patchset: 6552 CVS date: 2005/06/03 03:27:31 --HG-- extra : convert_revision : 2146e9df76531c8dc110c43c38250bc195f9dd3d 2005-06-03 David Teigland when freeing locks for withdraw, don't try to free fake lvb's 2005-06-02 David Teigland correctly interpret the return value of do_barrier don't need to include lvb_table.h 2005-06-02 sunjd msoffen's missing stuff is back now. ;-) CVS patchset: 6538 CVS date: 2005/06/02 03:13:18 --HG-- extra : convert_revision : 60af806090785646c41b91a5c8b4f7d9573a0dd7 2005-06-01 A. J. Lewis o Fix for fenced portion of bz #155478 2005-06-01 Patrick Caulfield I always miss one of 'em. move everything around! split cman into several files:
  cnxman.[ch]      Comms
  membership.[ch]  Membership thread
  commands.[ch]    Processes commands from libcman
  barrier.[ch]     Barrier code
  logging.[ch]     Logging
  daemon.[ch]      Daemon control. the select loop & messaging.
2005-06-01 David Teigland recovery timer can't be global, it must be per-lockspace 2005-06-01 Patrick Caulfield Add option for 2-node cluster. Fix create help. Fix buffer overflow if output filename was longer than input filename.
2005-06-01 David Teigland
    fix calculation of previous low nodeid
    look through correct list for failed nodes needing recovery
2005-06-01 Patrick Caulfield
    Use sockaddr_storage rather than sockaddr_in6
2005-06-01 David Teigland
    start group id's at 1 instead of 0
    close fd's of dead clients
    - include everything that needs sending in the standard message struct instead of having extra data follow the msg struct
    - fix up how to_nodeid is handled in messages so it gets byte swapped like the other fields
2005-06-01 Patrick Caulfield
    Fix potential SMP race
2005-06-01 sunjd
    temporarily disable the parts of msoffen's perl binding submission that break the build. msoffen likely missed some files to commit
    CVS patchset: 6517 CVS date: 2005/06/01 07:28:52 --HG-- extra : convert_revision : a659a51b028f4c7a7e3d13ebc0502e8f192e33a2
2005-06-01 msoffen
    Fixed to allow SWIG to build in separate source tree
    CVS patchset: 6516 CVS date: 2005/06/01 04:37:20 --HG-- extra : convert_revision : 8c66d9cba7aa47c85eafb43dbf152a6a87cf690e
2005-06-01 David Teigland
    remove repeated include of module.h
    header munging to match upstream
    correction of export symbol lvb_operations
2005-06-01 alan
    Put in a number of changes related to longclock_t. Along the way, I discovered that the code really didn't work right if you actually wrapped around on a 64-bit machine - which is more than a little unlikely. And, I undid some "error checking" changes done by others - along with an explanation of why I undid them.
    CVS patchset: 6511 CVS date: 2005/06/01 03:34:32 --HG-- extra : convert_revision : 3be30b63f6b2cd92c8d3e2bab11eec57e169a4d4
2005-06-01 David Teigland
    Work around gcc-2.95.x macro expansion bug (from akpm)
2005-05-31 David Teigland
    kobject was being freed too early in withdraw
    use correct list field in group struct
    only print debug lines to stderr when -D is used
    remove noisy debug line
    remove temp debug bits, add 'dump' option to get debug log from groupd
    new file bits for withdraw
    only print debug info to stderr when -D is used
    changes to debug logging
2005-05-31 Patrick Caulfield
    Fail if the nodename maps to the loopback device.
2005-05-30 David Teigland
    export dlm_lvb_operations symbol so dlm_device module can use it
2005-05-27 alan
    Put in some changes to allow a "try" option for the SNMP code and the SWIG code as per a suggestion from David Lee.
    CVS patchset: 6457 CVS date: 2005/05/27 20:26:53 --HG-- extra : convert_revision : ed83830bf611e24f17568a61025518a026e03921
2005-05-27 David Teigland
    tidy error handling
    don't depend on leave state here, always go to groupd
2005-05-27 Patrick Caulfield
    Bring forward some fixes from the kernel-based cman.
2005-05-27 David Teigland
    option to list only one group
    option to connect to groupd as a client for debugging
    a lot of fixes, use messages instead of barriers from libcman
    changes to debugging output, don't try another leave if one is in progress
    Set ls_first for gfs when a spectator is first to mount so gfs can bail out if there are any journals. ls_first still always set for first participant that mounts.
    install daemon
2005-05-27 Ken Preslan
    o Fix a bug I introduced that would keep GFS from replaying a journal on first mount.
    o Fix a race where an incore inode could be dealloced twice.
    o Other munging.
    Fix errors on rebuilds.
2005-05-26 Lon Hohberger
    Don't assume child nodes exist just because someone asks for them
    Ask for node name instead of for the existence of children from ccs
2005-05-26 davidlee
    rev 1.335 had introduced unconditional use of "mktemp" program. Various platforms don't have this. So detect, and have fallback to "touch". It had also introduced a subtle potential problem, in reassigning WHOAMI. Had earlier correctly been "/path/to/whoami", but became "whoami". Restore to "/path/to/whoami" behaviour.
    CVS patchset: 6426 CVS date: 2005/05/26 13:06:29 --HG-- extra : convert_revision : 1564ff10531c8d036c034566ae00dee4194a6d4f
    Supplement 'FatalMissingThing' with 'WarnMissingThing'. Let '--enable-swig' take 'try' option.
    CVS patchset: 6424 CVS date: 2005/05/26 08:13:15 --HG-- extra : convert_revision : 1f59f9b58b02e68fc07b5d5c4ff0d5c20b9bcedf
2005-05-26 David Teigland
    restart events after a delay
    tidying, do ifdefs around debugfs functions in a consistent way
    fix-dlm-extern-lvb_table.patch
    fix-dlm-without-debug.patch
    dlm needs 2.6.12, don't build until that's out
2005-05-25 Lon Hohberger
    Make magma_ucman deliver CE_SHUTDOWN when it should; make magma_tool ignore SIGPIPE so it can trap for closed sockets.
2005-05-25 Jonathan Brassow
    - Teigland's patch to make CCS skip clustering and just read the local cluster.conf
2005-05-25 David Teigland
    memcpy all the data after cman_get_nodes
    copy all the data from cman_get_nodes
2005-05-25 Patrick Caulfield
    Don't lose the port number
2005-05-25 David Teigland
    use libcman to check if victim has rejoined
    Need to check the cn_member field to tell if a node is a member, not just that it's in the list returned from libcman.
    not doing patches
    list new daemons that need to be started
    clean and install in group dir
    install groupd and group_tool
    build in lock_dlm
    add makefile
    don't build cman-kernel
    build src2 instead of src
    remove cman-kernel usage
    don't build deprecated cman and sm dirs
    look for libs and lib headers within the cluster tree
    get libdlm.h from the correct place within the tree
2005-05-25 Ken Preslan
    o Allow the appropriate FS-specific mount options to be changed on remount (BZ 156780).
    o Fix statfs when in spectator mode.
    o Cheat to make permission() faster.
    o Simplify truncate code.
    o Get rid of unnecessary RO flag.
    o Update man pages.
    o Other munging.
2005-05-24 Jonathan Brassow
    - fix SEG FAULT
2005-05-24 David Teigland
    Dynamic journal ids, done here using a simple message through libcman (may want to use dlm locks for this again sometime.)
    avoid double free of rsb's lvb when clearing lockspace
2005-05-24 Patrick Caulfield
    Use getaddrinfo rather than the (obsolete) gethostbyname2 call.
2005-05-24 David Teigland
    zero padding for id
    complete some missing bits; verify cluster name at mount, verify fence domain is joined at mount, support spectator option
    simplify code in the mount path
    fix uninitialized pointer in daemon, handle get_groups properly in lib when there are none to return
    align text
    recognize get_group request in daemon, use memcpy for info data, print debug state in group listing
2005-05-24 alan
    Put in 1.99.5 version number.
    CVS patchset: 6387 CVS date: 2005/05/24 05:20:02 --HG-- extra : convert_revision : 5f3391b7a9c633a9ab7eba8acf2aa83ff45b1d67
2005-05-24 David Teigland
    munge group data returned in query
2005-05-23 Michael Conrad Tadpol Tilstra
    fixup man page.
2005-05-23 Patrick Caulfield
    Only print "waiting for cman" if verbose flag is set.
    deprecated.
    Add a userland-cman plugin for magma. I've only tested this with ccsd. Lon, you might like to check this !
2005-05-23 David Teigland
    implement group_get_group() to get info for single group
2005-05-23 sunjd
    Bug 541:OCF directory should not be relocatable by autoconf
    CVS patchset: 6377 CVS date: 2005/05/23 09:32:02 --HG-- extra : convert_revision : 421c40d8f382fedaeb9add91a6923fe5bf543c84
2005-05-23 David Teigland
    make spectator and withdraw options visible through sysfs
    Implement join/leave info: a small string of app-specific data that an app can provide when joining or leaving that can be read by the other members.
    some byte-swapping calls were still commented out
    remove unused length field in message
    remove devel files
    replace with new version of lock_dlm in devel/
    put back flags to ignore certain messages, other minor stuff
2005-05-20 David Teigland
    initial bits to report group listing
    fix leave
    re-enable byte-swapping of messages
2005-05-20 Patrick Caulfield
    unregister_lockspace() now works.
2005-05-20 David Teigland
    the call to process_recover_msg() had been commented out during porting and never uncommented
    - remove the group struct when we're done leaving
    - don't reference ev struct after it may have been freed
    various fixes completing switch to libgroup
    return 0 on success from group_join/leave
    build fence_tool
    add comment
    join/leave now just send messages to fenced which must already be running
    changing things so group_join/group_leave are initiated by fence_tool messages
    munging
2005-05-19 Chris Feist
    Changes to the way the local build works: Instead of running a make install from the top level into a 'build' directory we now just configure each package with the proper include directories for header files and libraries. This should help prevent accidentally using files that are locally installed on the system (instead of the files in CVS). This should also make it easier for external people to build from HEAD. Please let me know if there are problems or something doesn't work quite right.
2005-05-19 David Teigland
    better debug output
2005-05-19 andrew
    An option for making the CRM use regular malloc/free - Useful when looking for memory leaks as it will point to the place in the CRM that allocated it rather than to cl_malloc.
    CVS patchset: 6317 CVS date: 2005/05/19 10:52:52 --HG-- extra : convert_revision : 56a9708106fed9f09f1d343ca499eb7487d5f6b2
2005-05-19 David Teigland
    clear old recovery flags (LOCKS_VALID) when recovery begins
    don't build fence_tool, not updated yet
    new version using libgroup and libcman
    add missing make_args
2005-05-19 Ken Preslan
    o Changed the lookup so it doesn't need to take the new inode's glock. This simplifies it a lot.
    o Other cleanup.
    Fix BZ158133. Fix an oops that occurs when an acl is set that consists of nothing but a header.
    munge
    Fix BZ158133. Fix an oops that occurs when an acl is set that consists of nothing but a header.
2005-05-18 andrew
    Configure flag for enabling dmalloc libs
    NEEDS TESTING: If we already have libltdl, then use it and don't bother configuring the libltdl directory
    CVS patchset: 6311 CVS date: 2005/05/18 21:29:57 --HG-- extra : convert_revision : 6d4278f59b234323a74ec95ffd486f821bae9eda
    Use the same approach for libtool as for make
    Don't supply a default to AC_CHECK_PROGS so that the following test can fail
    Prefer g* variants if available
    CVS patchset: 6304 CVS date: 2005/05/18 17:37:22 --HG-- extra : convert_revision : e75682930eb8497eeb6664d4729ac9d8a508ef34
2005-05-18 Chris Feist
    Added LDFLAGS variable in Makefile so dlm_tool will build if libdlm_lt is not installed.
2005-05-18 David Teigland
    missed new file
    add group_leave arg
    add daemon and dlm_tool
    fix sprintf's
    really use libgroup
    use libgroup
    add Makefile
2005-05-18 Patrick Caulfield
    Add dummy struct to keep compilation clean.
    Missed header, sorry.
2005-05-18 David Teigland
    don't add duplicate local addresses
    add \n to printk's
    change read_lock to read_unlock
2005-05-17 Patrick Caulfield
    Build against dlm-kernel/src2. This (temporarily I hope) disables the query interface as dlm2 only has the base locking primitives ATM.
    Improve "can't connect to cman" error.
2005-05-17 David Teigland
    remove unused function
    add _GPL to EXPORT_SYMBOL
2005-05-16 Patrick Caulfield
    Use a different method for finding broadcast addresses. Interfaces of the form "eth0:0" are now usable.
2005-05-16 David Teigland
    don't add connection if accept fails
    lib interface for groupd
2005-05-16 msoffen
    Replaced AC_PROG_LIBTOOL - Was failing on FreeBSD - will still work on Linux
    CVS patchset: 6280 CVS date: 2005/05/15 23:13:57 --HG-- extra : convert_revision : 25a0c0b0b4eaa22cd113c61e1a7d6ad61a9bb236
2005-05-13 Patrick Caulfield
    Don't allow name= as a fence argument as it causes problems.
2005-05-13 David Teigland
    daemonize
2005-05-13 Lon Hohberger
    Fix for example.conf
2005-05-12 Lon Hohberger
    Fix arg swap problem when reading from stdin
    Fix arg parsing
2005-05-12 David Teigland
    no parens around defined values
    don't want .orig from patch
    change debug message
    end files with \n
    remove trailing white space
    - remove a couple FIXME questions
    - call confirm_master() in grant_after_purge() in case the node became master of the rsb during recovery and had locks waiting for master confirmation before being processed
    fix up some comments
    - get all lines under 80
    - replace printk with log_print()
    - remove unused function
    remove the timeout in wait_function which was for debugging, and don't print all rsb's when clearing the recover_list
    Move recover_rsbs() to the main recovery section, requires a new status barrier after recover_locks().
    dlm_recover_members_wait() and dlm_recover_directory_wait() moved to recover.c; wait_status functions are now static
    Wait until all locks are recovered before doing lvb recovery.
    This simplifies the process and doesn't waste time running the lvb recovery routine multiple times on some rsb's.
    spinlock fix from patrick
2005-05-12 Ken Preslan
    Fix bug #129468
    Serialize the block mapping code so a writepage() call can't see the file tree in an inconsistent state.
2005-05-12 andrew
    Didn't seem to be working, does now.
    CVS patchset: 6234 CVS date: 2005/05/11 22:10:44 --HG-- extra : convert_revision : 88599731bbb976c41af09c69d14deba363e506d1
2005-05-11 Ken Preslan
    Refix a problem with the rename lock. Don't associate it with rename transactions.
2005-05-11 Lon Hohberger
    fix targeted relocation bug when running with gulm
    Bull PAP + Bull IPMI-over-LAN support
2005-05-11 Ken Preslan
    Fix a couple of rename() bugs.
2005-05-11 Lon Hohberger
    Fix API change: we no longer get Logged_out after Fenced
2005-05-10 Ken Preslan
    gfs2_jadd and gfs2_grow.
    o Add back code to support gfs2_grow and gfs2_jadd.
    o Start checking for dirty journals when freezing again.
2005-05-10 Lon Hohberger
    Fix fd leak, change resource-group -> service, fix node ID display
    Fix file descriptor leak in services.c
2005-05-10 Patrick Caulfield
    Refill the nodes write queue once we are woken up after -EAGAIN.
2005-05-10 David Teigland
    use cman barriers again
    adjust size of write, remove prints
2005-05-09 Lon Hohberger
    Fix 157248
2005-05-09 Patrick Caulfield
    Address numbers start at 1
    Add options to cman_dispatch() so that callers can filter out non-interesting messages. These will be saved on a list and sent when the user is again interested in them. This is used by the query functions; they prevent any DATA or EVENT messages being dispatched while waiting for (e.g.) barriers to complete.
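The cman_dispatch() filtering described in the entry above can be pictured with a small model. This is only a sketch of the idea in Python, not the libcman C code: the `Dispatcher` class, the message-kind constants, and the callback signature are all hypothetical names introduced for illustration.

```python
from collections import deque

# Hypothetical message kinds. The entry above mentions DATA and EVENT
# messages and replies to queries (e.g. barriers); these constants are
# illustrative and are not taken from libcman.
DATA, EVENT, REPLY = "data", "event", "reply"


class Dispatcher:
    """Toy model of a dispatcher whose callers can filter message kinds."""

    def __init__(self, callback):
        self.callback = callback  # invoked for each delivered message
        self.saved = deque()      # messages held back, in arrival order

    def dispatch(self, incoming, wanted):
        """Deliver saved and incoming messages whose kind is in `wanted`;
        keep everything else on the saved list for a later call."""
        pending = list(self.saved) + list(incoming)
        self.saved.clear()
        for kind, payload in pending:
            if kind in wanted:
                self.callback(kind, payload)
            else:
                self.saved.append((kind, payload))


delivered = []
d = Dispatcher(lambda kind, payload: delivered.append((kind, payload)))

# While waiting for a barrier reply, only REPLY messages are of interest;
# the DATA and EVENT messages are parked on the saved list.
d.dispatch([(DATA, "d1"), (REPLY, "barrier done"), (EVENT, "e1")], {REPLY})

# Later the caller is interested in everything again, so the parked
# messages are delivered first, followed by the new one.
d.dispatch([(DATA, "d2")], {DATA, EVENT, REPLY})
```

In this model, saved messages keep their arrival order, so a caller that filters temporarily does not observe reordering once it widens the filter again. Whether the real implementation guarantees that is not stated in the entry.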
2005-05-09 David Teigland
    send a status request to every member at the start of recovery just to establish lowcomms connections with everyone
    make purge_queue() more general purpose, no functional change
    Tiny code change to implement significant optimization: if a node receives a lookup and is the master itself, process the lookup as a request and return a request reply.
    - big reordering of functions in lock.c to avoid declaring some prototypes at the start
    - add inline to a bunch of lock.c functions
    - make lock_rsb/unlock_rsb lock.h inlines
2005-05-07 Ken Preslan
    Update.
    Fix problem of writepage() needing to map blocks at weird times when another process might be changing the metadata tree. Added some locking so the writepage never sees an inconsistent tree. Plus some cleanups in the recursive glocking code.
    Add "gfs2_tool getargs" to get the gfs-specific mount arguments used to mount a filesystem.
2005-05-06 Ken Preslan
    Clean up journaled data code and metadata I/O code.
2005-05-06 David Teigland
    ignore start events if lockspace is running to avoid an assert failure
    NEW_MASTER flag wasn't being cleared on rsb's remastered locally
    move is_master() to lock.h and use it in recover.c
2005-05-06 Ken Preslan
    Unlock the page when erroring out of writepage().
    Fix a misinitialization in the diaper device.
2005-05-05 Lon Hohberger
    Arbitrary resource tree patch
2005-05-05 Ken Preslan
    Munge.
    Fix broken makefile.
2005-05-05 alan
    Fixed timeouts and monitor levels on IPaddr2
    CVS patchset: 6108 CVS date: 2005/05/05 14:19:52 --HG-- extra : convert_revision : 4b0cfecf041f4708d15ed4e1031b2f8f36d909ec
    Fixed the timeouts back like they were - since they're legal. Changed the depth parameter to 10 - to account for the fact that it's slightly more heavyweight than the lightest possible weight check.
    CVS patchset: 6107 CVS date: 2005/05/05 14:15:46 --HG-- extra : convert_revision : c1c09c516fe0007119f16d6710f94162895f1826
    Fixed start delay for IPaddr
    Still think default level is wrong.
    CVS patchset: 6105 CVS date: 2005/05/05 13:47:14 --HG-- extra : convert_revision : 67b2caca5bcdf5c4f6887fdbc8c0b4ec03aaad02
    Several Delay resource fixes:
    (1) added monitor delay
    (2) changed all delays so that they default to start delay
    (3) Got rid of separate "delay" parameter
    (4) made it support multiple simultaneous delay resources using resource instance for the file name stuff...
    CVS patchset: 6104 CVS date: 2005/05/05 13:40:10 --HG-- extra : convert_revision : cd2a6ad278843a820abd1423c5ebeb8cb772b592
2005-05-05 David Teigland
    status replies include a new "rcom_config" struct that is used to verify config params (like lvb length) are the same between nodes
2005-05-05 Patrick Caulfield
    Add the new commands to the ccs_tool man page, and fix a bad example.
2005-05-05 David Teigland
    improve comments and remove log_print about partial messages
    tidy a few things
2005-05-05 Patrick Caulfield
    If we failed to resolve the broadcast address, print the interface. 100:1 it will always read "lo".
2005-05-05 David Teigland
    fix a couple remaining over-80 lines
2005-05-05 Ken Preslan
    o Honor "data=ordered" for truncates
    o Partially completed truncates now resume after a crash
    o Fix a big performance problem caused by a mis-initialization in the diaper code.
    o Be smarter about deallocations when "data=ordered" is used.
    o Misc munging
2005-05-04 sunjd
    Bug 463: stops all resources on one NIC when we ask it to stop only one resource; Correct the directory error; List all parameter names in the beginning comment.
    CVS patchset: 6096 CVS date: 2005/05/04 19:28:17 --HG-- extra : convert_revision : a7c6992c662d51c71c42f1fd25cfc96a585dbae2
2005-05-04 Benjamin Marzinski
    Here's my fix of a fix of a fix. I wasn't initializing a list that needed to be initialized.
2005-05-04 David Teigland
    byte swapping wrong size in rcom struct
2005-05-04 Patrick Caulfield
    Remove some unnecessary includes.
    InitParser before doing a create. It doesn't seem to be necessary now...but.
2005-05-04 David Teigland
    specify lvblen when creating lockspace, 32 bytes
    kernel apps must specify the lvb size they'd like (multiple of 8 bytes)
    user apps continue to have a fixed size of 32 bytes
    do a schedule/retry when kernel_sendmsg() returns -EAGAIN, a work-around until write_space() works on the socket
    style stuff, mainly making lines under 80
    make the waitqueue usage more standard
2005-05-03 Patrick Caulfield
    Move cluster_conf into ccs_tool
2005-05-03 David Teigland
    - changes to debug messages
    - increase size of tmp buffer in midcomms
    - null terminate string of nodeids from sysfs
2005-05-03 Patrick Caulfield
    Check in the source rather than a binary, sigh
    Add command-line utility for managing cluster.conf files.
2005-05-03 David Teigland
    skip barriers for now
    improve debug logging
2005-05-03 Benjamin Marzinski
    This fixes bz # 156635. A variable that I allocated statically in my fix for 155597 needed to be allocated dynamically.
2005-05-02 andrew
    Because it is passed as an action parameter, it will have an OCF_RESKEY_ prefix. So copy into a variable that doesn't.
    CVS patchset: 6063 CVS date: 2005/05/02 10:58:38 --HG-- extra : convert_revision : cdbde66c347c5f33c20d1d425a75ebed7a909a5e
2005-04-29 Ken Preslan
    Don't do write_inode if we're PF_MEMALLOC.
    Munging.
    munge.
    Fix ACL leak.
2005-04-29 Benjamin Marzinski
    Fix for bz #155597. GFS used to be able to write over a portion of the log while it was still needed. This fixes that.
2005-04-29 Lon Hohberger
    Fix from Birger
2005-04-29 David Teigland
    need del_timer_sync() with the new timer-based dlm_wait_function() that was recommended
2005-04-28 David Teigland
    patch from Steve Dake to avoid using clm library
    ast_queue_lock can be a spinlock
2005-04-28 Ken Preslan
    Add back NFS support.
2005-04-27 Ken Preslan
    Get rid of an osi_.
2005-04-27 Lon Hohberger
    Fix timeouts for 32-way bull machines
2005-04-27 Michael Conrad Tadpol Tilstra
    gcc4-isms
    fixup libs
2005-04-27 David Teigland
    - need to lock_rsb/unlock_rsb in recover_locks() because dlm_recover_process_copy() may operate on the same rsb before we're done
    - in recover_locks() we should only add the rsb to the recover_list if there were locks recovered for it, it was added unconditionally
    - need to break from dlm_recover_locks() if recover_locks() returned an error, we weren't checking the return value
    spell out DLM at the top
    - check if lvb alloc failed
    - move new code for async convert into send_convert (cleaner)
2005-04-27 Patrick Caulfield
    events are not REPLYs.
    Allow cman_tool to override the node name when joining.
2005-04-27 David Teigland
    - set lkid in user's lksb earlier so they don't see 0
    - do async (ack-less) remote down-conversions
2005-04-27 Patrick Caulfield
    Use local libcman
2005-04-27 David Teigland
    go back to schedule_timeout() instead of msleep() in dlm_scand, not sure what the msleep was doing
    - dynamically adjust the delay when polling a node for its status, starting with 20 ms and adding 20 each time.
    - add to debug message the time recovery took
    don't wrap wait_event in an infinite loop and use a timer
    also return -EBUSY for convert if lkb_wait_type is non-zero
    put the '*' just to the left of the structure field
2005-04-27 Ken Preslan
    Fuzzy statfs(): When statfs() is called, the return value is the state the whole filesystem was in sometime in the last X seconds plus any local changes. This algorithm is very similar to the way we do quotas. X is a tunable parameter, statfs_quantum. (The default is 60 seconds.) A statfs() call now requires no network or disk I/O.
2005-04-26 Ken Preslan
    Add some new block types.
    Rearrange some assignments to work with gcc4.
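The 2005-04-27 fuzzy statfs() entry above describes a caching scheme that a small model may make easier to follow. The Python sketch below is an illustration under stated assumptions, not GFS code: the `FuzzyStatfs` class, its methods, the explicit `now` argument, and the `publish_local` callback are hypothetical; only the `statfs_quantum` name and its 60-second default come from the entry.

```python
class FuzzyStatfs:
    """Toy model: answer statfs() from a cached cluster-wide snapshot plus
    local changes, refreshing the snapshot at most once per quantum."""

    def __init__(self, read_global_free, statfs_quantum=60):
        self.read_global_free = read_global_free  # expensive shared read (assumed)
        self.statfs_quantum = statfs_quantum      # seconds between refreshes
        self.snapshot = read_global_free()        # last known cluster-wide free blocks
        self.snapshot_time = 0
        self.local_delta = 0                      # local changes since the snapshot

    def allocate(self, blocks):
        self.local_delta -= blocks

    def free(self, blocks):
        self.local_delta += blocks

    def sync(self, now, publish_local):
        """Background step: once per quantum, publish the local changes and
        re-read the cluster-wide value."""
        if now - self.snapshot_time >= self.statfs_quantum:
            publish_local(self.local_delta)
            self.local_delta = 0
            self.snapshot = self.read_global_free()
            self.snapshot_time = now

    def statfs(self):
        """Cheap path: no I/O, only cached state plus local changes."""
        return self.snapshot + self.local_delta


# A dict stands in for the shared on-disk value (an assumption of this sketch).
cluster = {"free": 1000}


def publish(delta):
    cluster["free"] += delta


fs = FuzzyStatfs(lambda: cluster["free"])
fs.allocate(100)                 # local change, visible immediately
seen_after_local = fs.statfs()

cluster["free"] -= 50            # another node allocates 50 blocks
fs.sync(30, publish)             # too early: the snapshot is not refreshed
seen_before_quantum = fs.statfs()

fs.sync(60, publish)             # quantum elapsed: publish and refresh
seen_after_quantum = fs.statfs()
```

In this model a node sees its own allocations at once, while changes made elsewhere become visible only after the next refresh, which matches the entry's "sometime in the last X seconds plus any local changes" wording.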
2005-04-26 Patrick Caulfield
    Use local libcman
2005-04-26 David Teigland
    make scan/toss_secs dlm_config values, get rid of empty lockspace_exit
    was hoping to avoid this, but search on every lkid we create to verify it's not in use
2005-04-26 Patrick Caulfield
    Undo some of the "tidying" done by indent.
    Use rwlock rather than rw_semaphore
2005-04-26 David Teigland
    depends on INET, select IP_SCTP
2005-04-26 Patrick Caulfield
    Remove some redundant stuff.
    change uint*_t to __u*
2005-04-26 David Teigland
    misc other formatting and tidying from reviews
    newlines
    some misc tidying, use printk log levels
    - get rid of parentheses around defined values
    - use the kernel.h max_t() instead of our own MAX()
    clean up some spots where we don't need error handling, complete the error handling in some other places
    use msleep and ssleep instead of schedule_timeout
    "But, in any case you might as well move the label 'top' inside the if just before the for loop, since the only place you ever goto top you've just set ri->next to NULL, so you know you are going to end up inside the if in any case, no need to actually do the test every time." Jesper Juhl
2005-04-26 Ken Preslan
    Remove some debug code.
    GFS2. Still a work in progress. But it should be faster.
2005-04-25 David Teigland
    update FIXME comment
2005-04-25 andrew
    Fill in some fields automatically if not set
    Fix a typo
    CVS patchset: 5947 CVS date: 2005/04/25 12:40:17 --HG-- extra : convert_revision : a42502e8a4fc6fd45a73fb5d66e016577765dd59
2005-04-25 David Teigland
    read_lock will do when creating root_list
    move some recovery-related functions from lock.c to recover.c
    more fixes
    adjust
    Initial untested code for recovering conversions between PR and CW.
    reduce debug noise
    In-progress down-conversions should just be completed at the start of recovery. Note that PR/CW conversions need work.
    return an error if no local addr's are set
    free rsb's on toss list during recovery so they don't need to be recovered; wasn't freeing lvb's on rsb's
    - grant_after_purge only on master rsb's
    - dlm_message_in is done before saving on requestqueue (not again after)
    - reject finish events with incorrect event_nr
    - don't set lkb_status directly when receiving new master-copy lock during recovery; add_lkb wants to set that itself
    - hold/put lkb needed around unlock/cancel processing in dlm_recover_waiters_pre (match correct refcounting in normal case)
    - copy args to lkb in receive_unlock/cancel only after locking the rsb (like we now do with validate_args)
    - return errors instead of asserting in some spots
    - copy args to lkb only after the range/lvb alloc which may fail
    - don't scan for rsb's to free while the ls is being recovered
2005-04-24 Ken Preslan
    Quit yer whining.
    Let there be compilation!
    Rearrange order so it actually compiles.
2005-04-23 David Teigland
    Pass nodeid/addr info directly from node_ioctl.c to lowcomms where it's used, removing the extra staging of the info in member.c where it's not relevant. Also a little lowcomms tidying.
2005-04-22 alan
    I changed the ocf-shellfuncs script to allow OCF resource agents to be invoked as init scripts in the fashion which we discussed in the standards meetings we held a while back. The previous version (mistakenly) prohibited that. Also fixed up the logging both for OCF and non-OCF resource agents to be much more informative and uniform when compared to the way things used to be done.
    CVS patchset: 5930 CVS date: 2005/04/22 21:36:38 --HG-- extra : convert_revision : d024d595ace44ce70e4481943f216096ed127975
2005-04-22 Lon Hohberger
    msg_init isn't ready yet; remove for now
2005-04-22 Patrick Caulfield
    Make debugging configurable without a recompile.
    Read config variables (those that were in /proc) from CCS.
    Lots more comments in libcman.h
    Rename cman_get_join_count() to cman_get_subsys_count() to avoid confusion.
2005-04-22 David Teigland
    Fix one of the big fixme's: immediately after finding the lkb in dlm_lock() and dlm_unlock() we were checking and filling in the lkb's fields all without the rsb locked. We now pass the necessary input args down to the next stage so they can be safely saved in the lkb after the rsb is locked.
2005-04-22 zhaokai
    add compile for hto-mapfuncs and BASENAME command path
    CVS patchset: 5918 CVS date: 2005/04/22 08:25:41 --HG-- extra : convert_revision : 11836fdbab88b67a1e774063748388b502311743
2005-04-22 David Teigland
    remove some complexity
    get rid of DLM_RELEASE_NAME
    update description of dlm.h flags
    remove query bits that were commented out of device code
    seems EXPORT_SYMBTAB isn't used any more
2005-04-22 zhenh
    add OCF IPaddr/IPaddr2 regression test
    CVS patchset: 5907 CVS date: 2005/04/22 02:53:46 --HG-- extra : convert_revision : 627021af1fdfefff61d0ec46263f50abf8b34403
2005-04-21 Lon Hohberger
    Fix GCC4 warnings
2005-04-21 Patrick Caulfield
    argv[0] should be "cmand" not "cman"
2005-04-21 Michael Conrad Tadpol Tilstra
    install symlinks to the .so
    make .a with ar not ld - patch via Fabio M. Di Nitto
2005-04-21 Patrick Caulfield
    Don't leave debugging on by default.
    libcman to go with userland cman daemon
    Userland cman daemon.
2005-04-21 David Teigland
    - remove hierarchical sections that were commented out so it can all be added together
    - clear up naming inconsistency lkid/remid parent_lkid/parent_remid
2005-04-21 alan
    Put in code to make the inclusion of sys/prctl.h contingent on its existence ;-)
    CVS patchset: 5877 CVS date: 2005/04/21 04:19:58 --HG-- extra : convert_revision : 666df4ab5d65c8ba5dd1e7dfd4c83de5f70b455e
2005-04-21 David Teigland
    remove sbf flag
2005-04-20 alan
    Changed configure.in so that it defaults to turning on -funsigned-char flag if the CC in use supports it.
    CVS patchset: 5863 CVS date: 2005/04/20 17:57:42 --HG-- extra : convert_revision : 485df1e1335b631427a17d48c111680e846e8433
2005-04-20 Lon Hohberger
    Add NBB1600 support + fix IPS800[CE] support
2005-04-20 Patrick Caulfield
    zero the difference between a sockaddr_* and a sockaddr_storage
    use lvm_operations array to determine whether the LVB was updated or not.
2005-04-20 David Teigland
    Move the __lvb_operations table into lvb_table.h so it can be included by device.c
    use long to save pointer val
    pack_rcom_lock/receive_rcom_lock_args was missing lkb_status and astaddr's
    use preferred inline
    Improve logic that delays and reduces fencing. When fenced is recovering for a failed node, the 'post_fail_delay' is used to give victims some time to rejoin the cluster and avoid being fenced. If this happens once, then it's likely to happen again and the 'post_join_delay' is more appropriate, so fenced switches to the 'post_join_delay' value (if it's larger which is usually the case.) The common situation where this helps is when multiple nodes fail causing the cluster to lose quorum and then the failed nodes all rejoin the cluster at about the same time. The rejoining nodes are more likely to all avoid being fenced if fenced uses the larger post_join_delay.
    lock_dlm now exits, and pool doesn't
    a couple updates
    queries are waiting until after the first round
    remove print formats we're not using
    patches used by 'make patches' to generate complete dlm.patch
    update make patches build stuff
2005-04-19 blaschke
    Bug 361 - Added support for ST_CONF_XML to all STONITH plugins, where the XML looks like the following for ipaddr: ipaddr The IP address of the STONITH device
    Also did a bunch of cleanup in STONITH plugins:
    1) Added/fixed ST_DEVICEID, ST_DEVICENAME and ST_DEVICEURL in get_info() where necessary
    2) Changed GetAllValues to CopyAllValues
    3) Removed wasted indirection in LOG macro() invocations, i.e.
    LOG(PIL_CRIT, "%s", "out of memory")
    4) Removed _() macro invocations
    5) Fixed ignoring NULL return from REPLSTR() macro
    6) Replaced pluginDevice->config with pluginDevice->sp.isconfigured
    7) Removed parse_config_info() in nw_rpc100s and rps10 because set_config() used snprintf to put all parms into a string and called parse_config_info() to extract the parms via strtok
    8) Replaced sprintf with snprintf
    9) Replaced glb_debug with LOG(PIL_DEBUG
    10) Used StringToHostList and CopyHostList to replace corresponding code where possible
    11) Replaced malloc, free, etc. with MALLOC, FREE, etc.
    12) Fixed numerous memory leaks
    13) Fixed dereferencing returned memory without checking for NULL
    14) Fixed reset_req() in ipmilan, nw_rpc100s, rcd_serial, rps10 and wti_nps where the host was lower-cased but then not passed into strcmp
    15) Added api_nexxus_connect() back to vacm, which was removed in 1.10
    16) Enhanced get_info() in ibmhmc to include HMC version in ST_DEVICENAME
    17) Reordered plugin build order alphabetically
    18) Moved riloe.in to lib/plugins/stonith/external/
    and some cleanup in STONITH library:
    1) Changed GetAllValues to CopyAllValues to indicate that copies of value strings are returned
    2) Used StonithPIsys->imports->malloc et. al.
    instead of C library malloc to handle memory so that library and plugins use same memory pool
    3) Fixed dereferencing returned memory without checking for NULL
    4) Rewrote stonith_types() to copy strings returned from PILListPlugins()
    CVS patchset: 5855 CVS date: 2005/04/19 18:13:36 --HG-- extra : convert_revision : a4b7796fa91a3de688b45c1e3a7e2c68f9c3038a
2005-04-19 Lon Hohberger
    Fix options description
2005-04-19 David Teigland
    unmount bits
    recover_done, not done is sysfs file name
    don't confuse lock_dlm's uevents as ours
    misc fixes
    userspace version of kernel's list.h
    lock_dlm daemon manages mount-group membership in userspace
    wasn't closing fd's for sysfs files
    include groupd.h
    process group manager
    preparing to build from linux/drivers/dlm/
    update copyrights
    - make functions static
    - add dlm_ prefix to some functions
    - remove stuff not being used
    byte swapping
2005-04-18 Benjamin Marzinski
    fixed bug that caused gnbd_monitor to only successfully monitor a device until it failed once. Also, made it so gnbd_monitor didn't wait for all the users of a failed device to close it before trying to reimport it.
2005-04-18 David Teigland
    use correct libcman
    can now use openais
    most of the byteswapping
    dlm_member gone
    dlm_node.h replaces dlm_member.h
    add empty byte swapping functions
    node ioctl changes
    simplify the ioctl bits now that they're only used for setting node addresses, and rename to node_ioctl since it's no longer related to lockspace membership
2005-04-15 Patrick Caulfield
    Add DLM_SBF_LVBUPDATED, needed by userland i/f
    Separate out device.c into its own module that only depends on the external DLM interfaces.
2005-04-15 David Teigland cancel waiting locks; remove last lkb ref in both revert/remove_lock so they're consistent reject invalid event_nr's on start add missing wake_up fix bug in recover_master_copy 2005-04-14 Ken Preslan Fix bug #154902: Replace the function that gets confused on certain device sizes with a different function -- a new and improved one, that always knows what it's doing. 2005-04-14 Lon Hohberger Fix bonding link detection (port from clumanager 1.2.26) 2005-04-14 David Teigland don't do useful work in an assert macro 2005-04-14 Patrick Caulfield Set unlock artarg 2005-04-14 David Teigland a couple fixes next version of lock_dlm, a bunch of stuff is moved to userspace quit waiting for fenced to join (-w) after 10 seconds if fenced hasn't even begun joining yet; usually means fenced has exited 2005-04-13 Adam Manthei Changes to the init scripts. BZ's 153739 and 153741. 2005-04-13 Jonathan Brassow - remove cmirror target 2005-04-13 David Teigland clear up confusing names update lvb recovery function misc recover fixes 2005-04-13 alan Added new extracttests python script for giving Andrew (and others) test logs they're happy with. CVS patchset: 5759 CVS date: 2005/04/13 07:09:07 --HG-- extra : convert_revision : cb6edff25ea49eec7e035b97ecf55131cb1d1f43 2005-04-13 David Teigland complete more of the bits that are replacing rebuild.c 2005-04-12 David Teigland changes for debugfs 2005-04-12 andrew Remove echo from ip_init... 
it messes with ip_status and ip_monitor Only use calculated NIC/NETMASK/etc if none are provided Turn off "special" handling if lvs_support is not required Return the current $IF if on linux/sun so that ip_start can use it CVS patchset: 5750 CVS date: 2005/04/12 14:40:04 --HG-- extra : convert_revision : e9cdce65464f8688c4e84db237ab7077e44e7cf2 Point to the new home of the common CRM library CVS patchset: 5746 CVS date: 2005/04/12 12:57:24 --HG-- extra : convert_revision : 4696fdb6ec88fc08ac6b06df2d7d736aaa77dc72 2005-04-12 David Teigland misc minor updates comment out noisy logging refcount fixes, now requires post 2.6.11 version of kref_put 2005-04-12 andrew According to lmb: basically, being configured on "loopback" didn't count as running. And, that makes sense, because that just meant the IP was in "standby" mode for LVS. Which is good because its a lot easier to implement and means I can remove the OCF_SUCCESS_VARIANT rubbish. CVS patchset: 5734 CVS date: 2005/04/12 06:49:12 --HG-- extra : convert_revision : 9b897d2191f389e2089b90b2406191fe51e19cf2 2005-04-12 David Teigland same schedules that were added to the RHEL4 branch so serviced doesn't chew up CPU 2005-04-11 andrew O-oh, libxml is now used by the drac3 plugin - better revert most of the previous change. CVS patchset: 5715 CVS date: 2005/04/11 09:40:59 --HG-- extra : convert_revision : fb450e5499b6ca63d744d1d726cf763b942bf955 libxml2 is no longer required for linux-ha appologies if i screwed this up CVS patchset: 5714 CVS date: 2005/04/11 09:02:16 --HG-- extra : convert_revision : 0ed53a1131742959f0d6c7bf36ee24f4d8e5b5fa 2005-04-11 David Teigland outline for last recovery stage that replaces rebuild.c new rcom routines that use normal lowcomms get/commit_buffer change to per-ls the list of lkb's waiting for reply, split a couple reply-processing functions to remove some duplicated recovery code 2005-04-10 alan Print the return code if sendarp fails. 
CVS patchset: 5701 CVS date: 2005/04/10 05:07:30 --HG-- extra : convert_revision : 275a7bbd852b544a27e2d39a61517f9ddd4777b6 2005-04-08 David Teigland couple fixes and thread to free rsbs 2005-04-08 Jonathan Brassow - typos - no cman on upgrade. - commit changes so I don't lose them. 2005-04-08 andrew Expose various "special" functionality as parameter options and turn it off by default. Only report the resource running IFF it is running on *our* node. CVS patchset: 5679 CVS date: 2005/04/08 13:52:30 --HG-- extra : convert_revision : 31987d369dffff951b93f529ee90a1a1dc6e0d55 2005-04-08 Patrick Caulfield Get userland working again. 2005-04-08 David Teigland several fixes 2005-04-07 Benjamin Marzinski Modified gnbd so that it can be used with multipath more easily. When you export a gnbd, it will try to grab a unique id from the underlying device with scsi_id. If it can't, it will use the gnbd name as the unique id. It is possible to override this with -u. gnbd_import can get the unique id of a device it has with the -d option.
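The unique-id selection described in the gnbd entry above can be sketched as follows. This is only an illustration in Python, not gnbd code: choose_uid and its parameters are hypothetical names, and the scsi_id lookup is represented by an optional string rather than an actual call to the tool.

```python
from typing import Optional


def choose_uid(gnbd_name: str,
               scsi_uid: Optional[str] = None,
               override: Optional[str] = None) -> str:
    """Pick the unique id for an exported device (hypothetical helper).

    Order of preference, following the entry above:
      1. an explicit override (the -u option),
      2. the id reported for the underlying device (scsi_id),
      3. the gnbd name itself as a fallback.
    """
    if override:
        return override
    if scsi_uid:
        return scsi_uid
    return gnbd_name
```

For example, choose_uid("disk1") falls back to "disk1" when no id could be read from the device.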
2005-04-07 David Teigland bunch of fixes 2005-04-06 andrew See comments - someone needs to set this correctly CVS patchset: 5627 CVS date: 2005/04/06 14:11:31 --HG-- extra : convert_revision : 9e908ff6d8ab5ae6ccc5146be182229fce071b3d Its not always part of an incarnatioN CVS patchset: 5626 CVS date: 2005/04/06 14:10:38 --HG-- extra : convert_revision : fc28a780791735785e93d4a5dbd9467088d1d917 Port the IPaddr script properly after wasting a week or so because of it Make the various ARP arguments parameters instead of reading a secret file Starting an already running resource is no longer an error Fix the status command Add $ to variable names when being returned Use return codes instead of printing "running" or "stopped" like OCF RAs are supposed to Make status work When not running return OCF_NOT_RUNNING instead of OCF_ERR_UNIMPLEMENTED CVS patchset: 5625 CVS date: 2005/04/06 14:10:04 --HG-- extra : convert_revision : adce1bcfd52359ddb94f1c44c490e984f9375216 2005-04-06 David Teigland better way of returning some errors, more recovery bits bug fixes fix some bugs copy other hash function from gfs which works as well, but without the big table 2005-04-05 David Teigland don't need cluster link add file 2005-04-05 Patrick Caulfield Say something if sendmsg fails. 2005-04-05 David Teigland remove files changes to existing code corresponding to new lock.c add files some fixes from testing 2005-04-05 Patrick Caulfield Add a test prog that got lost 2005-04-04 David Teigland backup recent work 2005-03-31 David Teigland back up work 2005-03-31 Ken Preslan Start including gfs_debug in the build. 2005-03-30 David Teigland backup recent work an example of using dlm_tool 2005-03-29 Patrick Caulfield Fix a couple of memory leaks Initialise idr_members lock 2005-03-29 David Teigland Add version number to the start of all dlm messages to help with future upgrading. This change is incompatible with previous versions of the dlm. 
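Why a leading version field makes the change incompatible, and how it helps later upgrades, can be sketched like this. The layout is an assumption made for illustration (a 4-byte big-endian integer); the real dlm header format is not given in this log, and PROTO_VERSION, pack_message and unpack_message are hypothetical names.

```python
import struct

PROTO_VERSION = 2                 # hypothetical version value
_HEADER = struct.Struct("!I")     # assumed layout: one 32-bit big-endian field


def pack_message(payload: bytes) -> bytes:
    """Prefix the payload with the protocol version."""
    return _HEADER.pack(PROTO_VERSION) + payload


def unpack_message(data: bytes) -> bytes:
    """Check the leading version field and return the payload.

    A peer that predates the version field would send data whose first
    bytes are not a valid version, so it is rejected here.
    """
    if len(data) < _HEADER.size:
        raise ValueError("message too short for version header")
    (version,) = _HEADER.unpack_from(data)
    if version != PROTO_VERSION:
        raise ValueError("unsupported message version %d" % version)
    return data[_HEADER.size:]
```

A receiver can later accept more than one version value here, which is the kind of upgrade path the entry alludes to.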
2005-03-29 andrew Have configure #define HA_LIBDIR and use it CVS patchset: 5491 CVS date: 2005/03/29 06:08:59 --HG-- extra : convert_revision : cfb5328eec3fa1b910e9d98ddf6d249098f48dbb 2005-03-29 David Teigland simplify dlm_astd handling by removing the wait_queue; this also avoids a possible hang when shutting down dlm threads 2005-03-28 alan Preparation for release 1.99.4 CVS patchset: 5489 CVS date: 2005/03/28 18:55:02 --HG-- extra : convert_revision : 32f0697430cd4d0e88f2182522f667f7b1ef2ef2 BUG 346: Disabled discontinued code, turn on CRM, LRM by default, etc. CVS patchset: 5488 CVS date: 2005/03/28 18:39:34 --HG-- extra : convert_revision : 69374af22ce72e36e3309fc12a7e6351651c8100 2005-03-28 David Teigland add error message error message couple fixes use info from cman to do set_local/set_node calls into the dlm couple fixes include event_nr in done message to groupd another small bit for rsb refcounts 2005-03-26 David Teigland On Fri, Mar 25, 2005 at 03:22:38PM -0800, Daniel McNeil wrote: Looking at the code, the problem is a race condition between dlm_astd() and release_lockspace(). dlm_astd can pull an lkb off the ast_queue and still be processing it while release_lockspace() is running, which calls dlm_dir_clear() and then kfree()s ls->ls_dirtbl. When dlm_astd() calls release_rsb() it leads to a dlm_dir_remove() which accesses the freed ls_dirtbl. With slab debug, this leads to a spinning write_lock() and a hung umount. My machines are 2 cpu systems which also might expose the race condition. The fix is below and is fairly simple, just do the astd_suspend() in release_lockspace() before the dlm_dir_clear() and kfree(). That way astd won't be processing an lkb on the ast_queue while it is being freed.
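The ordering in that fix (suspend the AST thread first, then clear and free) can be modelled in user space. The sketch below is a Python toy, not the kernel patch: Lockspace, Astd and their methods are hypothetical stand-ins, a plain mutex plays the role of astd_suspend()/resume, and setting dirtbl to None stands in for kfree().

```python
import threading


class Lockspace:
    """Toy lockspace: a directory table plus a queue of pending ASTs."""

    def __init__(self, lkbs):
        self.dirtbl = {name: True for name in lkbs}
        self.ast_queue = list(lkbs)


class Astd:
    """Toy AST delivery worker."""

    def __init__(self):
        # Held while a single lkb is being processed.
        self._running = threading.Lock()

    def suspend(self):
        # Returns only once no lkb is in the middle of being processed.
        self._running.acquire()

    def resume(self):
        self._running.release()

    def process_one(self, ls):
        with self._running:
            if not ls.ast_queue:
                return False
            lkb = ls.ast_queue.pop(0)
            # Touches the directory table; this would fail if the table
            # had already been freed underneath us.
            del ls.dirtbl[lkb]
            return True


def release_lockspace(ls, astd):
    # Suspend first, then clear and free, as the fix describes.
    astd.suspend()
    try:
        ls.ast_queue.clear()
        ls.dirtbl = None          # stands in for kfree(ls->ls_dirtbl)
    finally:
        astd.resume()
```

Without the suspend() call, the worker could pop an lkb and then find dirtbl already gone, which is the use-after-free the message describes.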
2005-03-25 David Teigland more work, largely refcounting related 2005-03-25 zhaokai fixed resource-agent name and version bug in meta_data, removed a debug comment CVS patchset: 5474 CVS date: 2005/03/25 01:31:49 --HG-- extra : convert_revision : 733d81d8ec8f1f76fa6578affcfd46955fc0c382 2005-03-24 Lon Hohberger Fix timeout bugs 2005-03-24 Patrick Caulfield Remove redundant struct member Fix memory leak if a joining node fails. 2005-03-24 David Teigland more work 2005-03-23 David Teigland version 2 of the central locking logic, eventually to replace locking.c and lockqueue.c 2005-03-21 Lon Hohberger add init.d to install set fix warning Use service instead of resourcegroup as root resource to match the UI and user expected behavior / terminology Fix various bugzillas (see ChangeLog) 2005-03-21 Patrick Caulfield Fix usage message for -n man pages are wonderful things - when you read them. Use hstrerror rather than strerror to print errors from gethostbyname2_r Use correct errno when reporting errors from gethostbyname2_r 2005-03-21 sunjd Add stonithd basic sanity check CVS patchset: 5452 CVS date: 2005/03/21 15:08:34 --HG-- extra : convert_revision : 942fc3681d7b92151dd465c5e5f79e4b162108dc 2005-03-21 Patrick Caulfield Set join time on local node 2005-03-21 zhaokai split one OCF argument into 4 arguments and polished it with OCF spec CVS patchset: 5450 CVS date: 2005/03/21 10:00:11 --HG-- extra : convert_revision : d568e49a624269dbd1658ac9c25d75b7502e2f75 2005-03-18 Benjamin Marzinski added man page for gnbd_serv Syncing head with RHEL4 branch. Added fix for 151321.
gnbd now defaults to getting the node name from the cluster manager Added Bastian Blanks patchs to the man pages 2005-03-18 David Teigland change sleep(1) to sleep(5) between a failed fence and a retry 2005-03-18 Patrick Caulfield Don't return an error on normal, synchronous, non-threaded unlock 2005-03-18 David Teigland use ls_debug_list for debugfs fix release and convert actions; use persistent flag on locks dlm_astd in wait_event_interruptible wasn't being woken by kthread_stop(), so quit using a wait_queue. 2005-03-18 Ken Preslan Fix precedence error. 2005-03-17 David Teigland a simple daemon to listen/talk to dlm in the kernel, connecting it with the userland group manager 2005-03-17 alan Updated version number to 1.99.3 CVS patchset: 5415 CVS date: 2005/03/17 05:41:00 --HG-- extra : convert_revision : 41a6337accb698431c00347d03c53f9dbf15a507 Put in a few spelling and documentation corrections. CVS patchset: 5413 CVS date: 2005/03/17 05:35:50 --HG-- extra : convert_revision : 06e2bfd2dbf213d8cb7aff3a054137eab9980b43 2005-03-17 David Teigland deal with some errors better hooks things up to debugfs making nodeid an int consistently 2005-03-16 lars Janitorial work: Stray \n removal. CVS patchset: 5397 CVS date: 2005/03/16 17:11:14 --HG-- extra : convert_revision : d2944370f945004cf662417ad326ad9a7a1c3d73 2005-03-16 David Teigland moving lock dumps to debugfs, copying previous proc.c and ipoib_fs.c 2005-03-16 Patrick Caulfield Increase size of gethostbyname_r buffer and improve error if it fails. 2005-03-16 David Teigland increase DLM_ADDR_LEN to 256 bytes -- must be at least as large as sockaddr_storage (128) need the sockaddr struct, not a pointer more addr length fixes - use the new timeout variety of wait_event - only copy length of sockaddr_storage to lowcomms remove usage of proc 2005-03-15 David Teigland cman doesn't provide proc entries 2005-03-15 Patrick Caulfield Slightly more sensible error returns for some join functions too. 
Replace the old nodeids array with idr_ routines. Don't need -lpthread Better error messages 2005-03-15 David Teigland fix output in /sys/kernel/dlm//members split some functions into another file so they can be shared 2005-03-15 zhaokai replaced ha_log with ocf_log CVS patchset: 5369 CVS date: 2005/03/15 07:48:06 --HG-- extra : convert_revision : d58568658fb78711aead9b26bd3ee15f887922d5 replaced ha_log with ocf_logi and change CVS patchset: 5367 CVS date: 2005/03/15 07:36:26 --HG-- extra : convert_revision : 822d3fdf0f58efb2513b33cc8cccf5d3cdb30035 replaced ha_log with ocf_log and polished return values CVS patchset: 5364 CVS date: 2005/03/15 06:52:58 --HG-- extra : convert_revision : 01685db937822edf6de0af26d1a8e1662d11c1b7 2005-03-14 Lon Hohberger Fix 151095 2005-03-14 Michael Conrad Tadpol Tilstra soft link libs for lon 2005-03-14 Lon Hohberger add resource rule printout to rg_test Clusterfs.sh / fs.sh fixups Show stdin options with -h 2005-03-14 David Teigland complete lock code 2005-03-14 Patrick Caulfield Make the _sync calls available to non-pthread applications. 2005-03-14 David Teigland command line interface for libdlm operations create/release/lock/unlock Use sysfs for lockspace control. And set_id is a new action instead of being part of start. Use sysfs for lockspace control. dlm-member ioctl's now only used for node id/addr settings. 2005-03-12 Ken Preslan Change it so a spectator has a 's' instead of the jid in its fsid. 2005-03-11 Ken Preslan Add a new "flags" parameter to lm_mount() of the lock module interface. When the LM_MFLAG_SPECTATOR flag is passed in to the lock module, GFS is asking to join the filesystem's lockspace, but it doesn't want to modify the filesystem. The lock module shouldn't assign a journal to the FS mount. It shouldn't send recovery callbacks to the FS mount. If the node dies or withdraws, all locks can be wiped immediately.
If the lock module doesn't implement the flag, GFS will work as expected except that the mount will reserve a journal it will never use. Add and implement a "spectator" mount option to GFS to take advantage of the spectator mount flag. When the mount option is used, the FS looks just like a RO filesystem. The difference is you can have lots and lots of them. 2005-03-11 Patrick Caulfield Build libcman Remove unused variable Don't try to add too many addresses. 2005-03-11 zhaokai replace ha_log to ocf_log and fixed some running bug E.g offered wrong parameter of network interface CVS patchset: 5314 CVS date: 2005/03/11 10:13:25 --HG-- extra : convert_revision : 643bacaf6a07bb989718bb705ee587b069e0f1f0 2005-03-11 David Teigland make the weight arg optional copy ast_queue fix from src files When shutting down a lockspace we can remove lkb's from the ast_queue without holding the ast_queue_lock. The dlm_astd thread is suspended, but it's possible that queue_ast() could be called by someone which modifies the ast_queue. We now do the correct locking. copy cancel fix from src/ 2005-03-10 Ken Preslan Munged printk() ordering. Add "debug" mount option. Causes gfs_assert_warn() and gfs_lm_withdraw() to BUG(). 2005-03-10 Patrick Caulfield lowcomms for the new src2 DLM. This uses SCTP for transport and therefore should support multihome systems transparently. 2005-03-10 David Teigland misc small bits, handle >1 local addr. When gfs does an lm_cancel() we need to do a dlm cancel if that's where the lock is blocked. We were doing nothing. Changes also to deal correctly with the ast result from a dlm cancel and the potential for a cancel to return an error. Fix problems with cancelation. We weren't dealing with waiting locks being canceled, and we weren't sending back an adequate result for a remote unlock to deal with a cancel. 2005-03-09 Lon Hohberger Properly cast Change to arch-dependent char instead of uint8_t 2005-03-09 A. J. 
Lewis o Slight modification to fsck man page for new fsck 2005-03-08 Patrick Caulfield Fix dependencies for join_ccs.o Clean transitionreason after a state transition has finished. 2005-03-08 David Teigland fix small things to get working 2005-03-07 A. J. Lewis o continue instead of breaking on errors in scan_inode_list() o Clear inode's metadata bitmaps when clearing the inode itself o Make sure the variable you're using for the block number is valid 2005-03-07 Lon Hohberger Fix 150481, part 2 2005-03-07 Ken Preslan Compiling is always nice. 2005-03-07 Lon Hohberger Misc. bugfixes 2005-03-07 Patrick Caulfield Set close-on-exec flag on DLM file descriptors 2005-03-07 David Teigland matching dlm_nodeid_addr/dlm_addr_nodeid standard copyright device node stuff for ioctl, largely copied from dm #define misc name 2005-03-07 zhaokai add missing fi in ocf_log() CVS patchset: 5270 CVS date: 2005/03/07 07:40:05 --HG-- extra : convert_revision : ecbca83478f0e7835bd2bf96dfd830bfb20e5d46 2005-03-07 David Teigland Program for managing dlm membership in dlm-kernel/src2. New dlm development that won't be functional with gfs for a while. - Removing kcl_ calls to cman and replacing with control ioctls. If node isn't found in sm_members report an error and don't oops. This may be a valid condition, but it's not clear from the info. 2005-03-04 Michael Conrad Tadpol Tilstra Was storing entries under the same name, this prevented withdraw from working when there was more than one filesystem mounted. Fixed. 2005-03-04 Lon Hohberger Fix for multiple simultaneous leaves not being handled properly 2005-03-04 David Teigland Ignore any NEWLOCKS or NEWLOCKIDS messages from a previous instance of recovery. 2005-03-04 Ken Preslan Linux 2.6.11. 2005-03-03 Ken Preslan Bastian Blank's manpage munging. 2005-03-03 A. J. Lewis o Fix the (d)inode_hash_insert() fxns (fixes bz #149706) o Update to latest ondisk.h from the s.r.c.
gfs kernel source 2005-03-03 Ken Preslan Forward-port Tadpol's 146711 fix. 2005-03-03 Lon Hohberger Add -h for clustat / clusvcadm 2005-03-03 A. J. Lewis o increment link count when dealing with bad '.' & '..' entries o Clears up the spurious "Found unused inode marked in-use" msgs (bz #150207) o Make sure disk is synced at end, handle link count for l+f better o Changing GFS_METATYPE_EA -> ED - handles metadata change in GFS 6.1 (bz #150208) 2005-03-03 msoffen Fixed so that CC gets defined as PRETTY_CC only if you request it. CVS patchset: 5254 CVS date: 2005/03/03 19:40:23 --HG-- extra : convert_revision : b8164d57167ea93d5799cf4089befe56bb853795 2005-03-03 Lon Hohberger Fix build problem. 2005-03-03 Benjamin Marzinski Fixed 146672. While it is still possible to see this bug, the problem that I saw every time it happened to me was that the process in gfs_log_dump() got starved waiting for the sd_log_lock semaphore. This fix changes sd_log_lock into a rw_semaphore, and uses down_write and up_write instead of down and up. read write semaphores are totally fair, so gfs_log_dump() can't get starved 2005-03-03 Patrick Caulfield Tidy printks Replace the array of connections with functions from linux/idr.h If we are a new master, don't try to rejoin an old node. 2005-03-02 Lon Hohberger Fix part 2 of #150079 Zero out cm_addrs when reading /proc/cluster/services for a member list 2005-03-02 Patrick Caulfield Set the socket priority to INTERACTIVE to ensure that our messages don't get queued behind anything else 2005-03-02 Lon Hohberger Implement basic recovery policy handling #149522 #150067. Part of #149735. See ChangeLog. 2005-03-01 Jonathan Brassow - fix scenario where a server dying could leave clients with regions marked that the new server would not know about, allowing for simultaneous writes and re-syncing operations. (Results in corruption). - Use mempool in places where memory allocation failures would make us unhappy.
- Some code clean-up - Other bug fixes that I can't remember. 2005-03-01 Lon Hohberger Add support for Bull NovaScale machines via ipmi-over-lan and PAP management console 2005-03-01 Daniel Phillips Poptize csnap-server, mksnapstore the rest of the way 2005-03-01 Lon Hohberger Misc. fixes; see changelog. 2005-02-28 A. J. Lewis o Convert remaining (f)printf's to log_* o Check malloc and memset return codes, more error reporting - should help with bz#149706 2005-02-28 Patrick Caulfield Don't send sequence number of zero, it causes trouble. 2005-02-25 Patrick Caulfield Get rid of spurious "up" in barrier error path. 2005-02-25 David Teigland remove "lkb xxxx exists" message which can flood the console/log. it only occurs during a rare recovery condition Add a bunch of schedule() calls to potentially-long loops in recovery routines. checkin removing unfencing wasn't complete, sorry 2005-02-24 David Teigland Remove unfencing since it needs to be reworked and won't be ready for the next release. 2005-02-24 Adam Manthei uses -t option to cman_tool to time out wait operations 2005-02-23 Michael Conrad Tadpol Tilstra reduced things down to a single ASSERT. passing back error codes and retrying instead. (most of the error codes will cause gfs to withdraw/assert now instead.) Removed the gulm interface from kernel space. Its broken, and it doesn't look like anyone was using it anyways. 2005-02-23 Patrick Caulfield Need this too. Add ioctl32 support. Thanks to Bastian Blank 2005-02-23 A. J. Lewis o Release buffers after use o Fix double free or corruption after pass 5 (bz#149262) 2005-02-23 Patrick Caulfield Add -t option to join/leave/wait, specifies the maximum amount of time that the operation will wait before giving up (waiting that is). Also cman_tool join -w will retry if it fails ENOTCONN (ie the node did not join a cluster). See the man page for some wrinkles. Put some locking round membership_task so we don't try to wake up a process that has gone away. 
2005-02-23 David Teigland When locks on the convert queue are granted, we need to try again to grant locks from the beginning of the convert queue. Locks at the beginning may not be grantable because of locks at the end. But, granting locks at the end (permitted when using the NOORDER flag) may make earlier locks grantable. Specifically, we have the following situation when running "gfs_tool freeze" in parallel on three machines: Granted 1 PR 2 PR 3 PR * nodeid 3 converts PR->CW Granted 1 PR 2 PR Convert 3 PR->CW * nodeid 2 converts PR->CW granted mode is demoted to avoid conversion deadlock Granted 1 PR Convert 2 NL->CW 3 PR->CW * nodeid 1 converts PR->CW granted mode is demoted to avoid conversion deadlock Granted Convert 1 NL->CW 2 NL->CW 3 PR->CW * conversions for 1 and 2 are blocked by 3's PR * conversion for 3 is granted Granted 3 CW Convert 1 NL->CW 2 NL->CW * other conversions are now grantable, we must try to grant them again 2005-02-23 Jonathan Brassow - break the cluster mirror file into separate files - commit changes to dm-log_cluster.c before its break-up 2005-02-22 A. J. Lewis o adjust the gfs ondisk format number Check that a device has been passed in (bz #149261) Fix spelling of "Succeeded" (bz #149267) Handle mistyped responses to queries more gracefully (bz#149278) 2005-02-22 Patrick Caulfield Remove kjoin.c as it's not used and won't even compile any more. 2005-02-22 Daniel Phillips Add libpopt command line args parsing to mksnapstore, csnap-server Argh! popt doesn't know about long long! 2005-02-22 David Teigland Recognize and resolve a second form of conversion deadlock. When it happens, you'll see the following in dlm_locks output: grant queue: empty convert queue: NL->EX, PR->EX Fixes bz 148861 2005-02-21 Adam Manthei remove the in_cluster() checking in stop. should no longer be needed since the start() case now waits to make sure that the node is in the cluster before returning. 
o added "-w" option to cman_tool join (bz #147828) o commented out CMAN_TRANSITION_RESTARTS o need mechanism for timing out "wait" parameter 2005-02-21 Patrick Caulfield I was trying to be too economical with code in cman_tool wait, ISQUORATE & GETMEMBERS can't really be conflated like that because they return different sorts of things. 2005-02-21 Michael Conrad Tadpol Tilstra If a local client logged into gulm, and the client got the nodelist before the local gulm connected to a Master gulm, the local client would not get updates when the full list was received. This fixes that. 2005-02-21 alan Setting things up for 1.99.2 release. CVS patchset: 5094 CVS date: 2005/02/21 06:22:54 --HG-- extra : convert_revision : 94cf4600afb8a84a5acd4f99baba892bde54d393 2005-02-19 Adam Manthei partial fix to cman init script to address bug #147828. still need to figure out why 'wait' isn't working like I expect. 2005-02-18 Chris Feist Added forgotten line in configure script which caused --sharedir to be ignored. 2005-02-18 Patrick Caulfield Display "mantis-friendly" membership state in /proc/cluster/status 2005-02-18 David Teigland "The attached patch makes it possible to ask fence_tool to not wait for quorum and just die in this case. This makes it easier to call fence_tool join in init scripts without the problem of a blocked startup." Bastian Blank ignore fence_tool's Q option 2005-02-17 Patrick Caulfield Deal with failed joinconf more sensibly. (that message may have to go though) Quick-quit out of the read/dispatch loop if a node goes down so we can process it in a timely manner. 2005-02-17 David Teigland remove some non-critical printk's Make dlm_recoverd thread a permanent fixture of each lockspace. The wake_up oopses I get when they're dynamic get in the way of other tests. This is already in the RHEL4 branch. 2005-02-16 Jonathan Brassow - commit man page changes from Bastian Blank 2005-02-16 andrew As it says, an option to enable -dev only asserts and testing code.
Should be used by any v2 testers. CVS patchset: 4947 CVS date: 2005/02/16 17:59:10 --HG-- extra : convert_revision : 4cffc0030094cd9d9ec55891a9a1e18e693b28ae 2005-02-16 Michael Conrad Tadpol Tilstra is deprecated, not depercated fenced wants -O make sure lock replies goto clients. Better reaction to error packet in LT. 2005-02-16 David Teigland The current manpages are written in plain nroff which is not parsable by many scripts. The attached patch converts the manpages in the fence package to the man macro package. From Bastian Blank 2005-02-16 Patrick Caulfield This fixes the socket leak in the case where a primary connection was closed due to EOF on the socket, the secondary did not get closed. 2005-02-16 David Teigland We were ignoring blocking callbacks for locks being converted which caused us to skip some that were necessary. Fixes bz 147798 (along with similar dlm fix) Blocking asts were being ignored for all locks being converted which resulted in some necessary basts being skipped. In particular, after a failed NOQUEUE conversion, gfs could be left holding a lock and getting no callback for it while others were left waiting. This changes things so that a bast message is ignored if the lock is being converted and NOQUEUE isn't set, or if the locks is being unlocked. Fixes bz 147798. 2005-02-15 A. J. Lewis o Count blocks used by inodes and update them if counted doesn't match ondisk o convert free metadata to free blocks in pass5 2005-02-15 Daniel Phillips Add snapshot store expansion 2005-02-15 Patrick Caulfield Remember to tell SM if we get down to one node. 2005-02-15 David Teigland list fence_tool document the wait (-w) option The current manpages are written in plain nroff which is not parsable by many scripts. The attached patch converts the manpages in the fence package to the man macro package. 
From Bastian Blank remove stray line don't complain when we see fence_tool's -w option Add option to fence_tool to wait for the node to complete its join and be a member of the fence domain. Two options: fence_tool join -w fence_tool join; fence_tool wait 2005-02-14 Michael Conrad Tadpol Tilstra removed an unused func. made a log_err print out the value of ocde instead of just complaining about it. flipped a message out of lgm_Always and into lgm_locking. 2005-02-14 Jonathan Brassow - replace an instance of ccs_get with ccs_get_list - missing break statement, thanks to Bastian Blank 2005-02-14 A. J. Lewis o convert (f)printf to log_* in fs_dir.c o Fix link counting when reattaching '..' in pass3 2005-02-14 Patrick Caulfield On sparc & s390 do biarch checking at runtime. This paves the way for AMD64 builds if anyone has one they'd like to try ;-) Thanks to Bastian Blank for most of this. Display node name in /proc/cluster/status 2005-02-14 horms Added LVSSyncDaemonSwap CVS patchset: 4895 CVS date: 2005/02/14 08:32:28 --HG-- extra : convert_revision : 85be48aa7c10cc3ef7511d452a85f8411da0e93d 2005-02-14 David Teigland document -w and -q options for join remove multihome setup from man page 2005-02-11 A. J. Lewis Fixed number of times it's necessary to run fsck to completely clean fs o debugging, fix a bug in leaf handling, and clear check another error case for .. dirents o fs_mkdir was grabbing in use blocks - fixed this problem o Make sure journaled data gets set to the correct metatype and format o Make sure the leaf info is current before writing it out in metawalk.c o Fix allocation and expanding of l+f directory o Fix entry counting when splitting leaves o Convert dir_info list storage from a flat list to a hash table. 2005-02-11 Jonathan Brassow - add -V and -h to ccs_test 2005-02-11 Michael Conrad Tadpol Tilstra put the info that i thought was in the man pages in there. 
2005-02-11 Patrick Caulfield Don't start the transition timer if we're doing the shortcut single-node endtrans. Also get rid of duplicate leave message 2005-02-11 David Teigland We add NL locks to a resource to implement gfs's hold_lvb(). When we request these NL locks, use the EXPEDITE flag so the NL request will be granted immediately. When a conversion request has been sent to a remote master, that remote master fails, and the sender becomes the new master of the resource, the requested lkb needs to be moved from the converting queue back to the granted queue prior to processing the conversion request locally. Previously, we were leaving the lkb on the converting queue. The specific problem I observed from this happened when the conversion was NOQUEUE and returned -EAGAIN. When an unlock for the lkb happened later on, dlm_unlock would find an incorrect status in the lkb of CONVERTING, return an error and trigger an assertion failure. don't do a dlm_lock_dump when an assertion fails cman_tool now picks as the local nodename whatever name has been entered for the local node in cluster.conf. If the node is represented in cluster.conf by a FQDN, then that's the nodename it will have. If the node is represented in cluster.conf by a hostname without domain, then that's the nodename that it will have. This should fix bz 146320. (As always, the network interface associated with the nodename is the one cman will use.) 2005-02-11 Chris Feist Merged changes from RHEL4 branch to fix building if not installing. 2005-02-10 Patrick Caulfield man page for cman_tool leave -w Add a -w (wait) option to "cman_tool leave". If the cluster is in transition then the leave ioctl returns -EBUSY which causes the leave operation to fail. If you add -w to the command-line then cman_tool will wait and retry the operation until it either gets a proper error or the leave completes successfully. This is probably what's really needed by the shutdown script.
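The wait-and-retry behaviour of "cman_tool leave -w" described above amounts to a small loop. The following Python sketch is illustrative only: leave_cluster, leave_op, delay and sleep are hypothetical names, and leave_op stands in for the leave ioctl, returning 0 on success or a negative errno value.

```python
import errno
import time


def leave_cluster(leave_op, wait=False, delay=1.0, sleep=time.sleep):
    """Call leave_op(); with wait=True, retry for as long as it reports -EBUSY.

    Any other result (success or a "proper" error) is returned at once.
    """
    while True:
        rc = leave_op()
        if wait and rc == -errno.EBUSY:
            # Cluster is in transition: pause, then try again.
            sleep(delay)
            continue
        return rc
```

Without wait=True the first -EBUSY is returned to the caller, which matches the failing behaviour the entry describes for a cluster in transition.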
2005-02-10 David Teigland Clean up changes from last commit related to bz 143487. fence_node -O is used to force a ccs connection, and a clustername arg is not used. 2005-02-09 Jonathan Brassow - fence_node has been changed to work better with gulm (143487) - Two new options have been added -c -O The -O overrides the quorum requirement. It requires the use of -c, due to the fact that CCS may be in a transitional phase that only solidifies when the cluster is quorate. Although I haven't tested it, this may allow a person to fence_node -O -c someone_else_cluster someone_else_node... a feature? a bug? 2005-02-09 Michael Conrad Tadpol Tilstra Removed a bunch of unused global vars. (well, externs to at least) Fixed bug 147602 - gulm doesn't allow 5 servers silly me, meant <=5 not <5. 2005-02-09 Benjamin Marzinski fixed journal corruption mentioned in bug 146672. If the latest entry written to the log before a crash was at the segment before the last dump entry, the last dump entry would be overwritten on journal replay. Changed the checks in check_seg_usage to avoid this. 2005-02-09 msoffen Changed to have enable_ansi on by default CVS patchset: 4815 CVS date: 2005/02/09 15:36:53 --HG-- extra : convert_revision : 891ed9000a033e81b5b290e58b189fbd4bc7d3fd 2005-02-09 Patrick Caulfield Be a bit smarter about when to schedule() when reading lots of data (usually during recovery) 2005-02-09 David Teigland remove unused "bulk lookup" function Similar bug to the one fixed the other day. If recovery is aborted during rebuild_rsbs_send(), some rsb's could be left on the recover_list. This triggers the recently added assertion during the next recovery cycle. We now clear the recover_list if rebuild_rsbs_send() is aborted. - exit with an error if an invalid votes value is found - fix error in the votes value check in the two-node case.
we were exiting with an error when no votes value was specified instead of accepting that the default of 1 is correct
2005-02-08 David Teigland - liblvm2clusterlock.so has moved from /lib to /usr/lib so we need: LVM2/scripts/clvmd_fix_conf.sh /usr/lib - no longer need manual devmap_mknod.sh for DM - remove info on cman using multiple interfaces on multihomed nodes since it's out for now remove three log_debug lines that are called so frequently during recovery that they blow away other interesting messages in the debug log
2005-02-08 Lon Hohberger fix #146924
2005-02-08 David Teigland In rcom_send_message, return an error from midcomms_send_message instead of panicking. Returning an error from both functions is valid when nodes fail during recovery. Also switch some log_error to log_debug so they don't go to the console.
2005-02-07 Lon Hohberger Include rg name in clustat -x output
2005-02-07 A. J. Lewis Large batch of changes - mainly dealing with ExHash directory handling. o Fix a spot where a pointer not getting set to NULL would segfault later o Fix entry checking - large number of changes - still some issues o Make exhash leaf entry removal work correctly o If we find a bad .. entry in pass3, relink it to the treewalk parent o Mark exhash directories with bad height or depth invalid o Correctly create directories and handle dirs with height > 0 o Track leaf pointer counts and error directories they don't match depth
2005-02-07 andrew New test CVS patchset: 4755 CVS date: 2005/02/07 11:39:27 --HG-- extra : convert_revision : 7eb80aae3f42ef6e1dd06ab433568d795f1383e4
2005-02-07 Patrick Caulfield use $(CC) for linking. don't install header executable
2005-02-07 David Teigland remove utils_srt.c - Assert that the recover_list is empty at the start of dlm_dir_clear. If there's anything incorrectly left on that list from a previous aborted recovery, we'll find it here and not with an odd oops later.
- Change log_debug() for normal locking operations to log_debug1() where DLM_DEBUG1 isn't set by default. This way abnormal or infrequent conditions from log_debug() aren't quickly blown away. - Don't call dlm_locks_dump() when DLM_ASSERT fails. - In rcom_send_message(), only test/clear the READY flag when doing a synchronous request. It's not used for async requests and because these async requests can be called from different threads it could lead to incorrect assertion failures. - In rcom_send_message() get rid of the RECCOMM_WAIT flag which was only used as a sanity check. Given that rcom_send_message can be called from multiple threads and only synchronous calls are serialized, an async caller could change the flag for the sync request and cause the reply message to be discarded. If recovery was aborted during restbl_rsb_update(), some rsb's could be left on the recover_list. This would cause an oops later when the next recovery sequence used the recover_list. Now we clear that list if we abort the recovery. It seems unlikely people have seen this since it requires a node to fail at just the right time while other nodes are doing recovery.
2005-02-03 Michael Conrad Tadpol Tilstra Partial workthrough of log messages. Some moving, some changing. mostly changed `errors' that gulm can handle to not use log_err Some corrections to the estimated lock space size calculation. Still needs more work, but it's closer. Once this gets more accurate will use it for the highwater mark instead of lock count. But not yet. Added outgoing queues to the local connect in LTPX. Fixes bug 146670. Might need to add read penalties, but works for me without, so holding off on that. Removed a util_* that hasn't been used in quite some time, but was still being compiled and linked in.
2005-02-03 Patrick Caulfield Check quit_threads in a few more places so we don't get blocked when trying to leave the cluster.
2005-02-02 A. J.
Lewis o Started working on interactive UI o More interactive updates and fixed inode updating for exhash directories in met o Make '-y' and '-n' options to the fsck work correctly in query()
2005-02-02 Patrick Caulfield If a joining node is removed by the time it has become a provisional member then remember to decrement cluster_members. Default for releasing userland lockspaces is "1", ie get rid of any master locks that do not have local holders. Use #defined constant rather than a plain number
2005-02-02 David Teigland need copytobin
2005-02-02 Ken Preslan Add some more fields to the lockdump print out. When running the "gfs_tool " commands (e.g. lockdump, freeze, withdraw, ...), do better processing of device mapper device names.
2005-02-02 Chris Feist Added slibdir into the toplevel make so we don't touch /usr/lib
2005-02-01 Ken Preslan Probable fix for BZ#146711. Don't try to update the head of the log if we're already shutdown.
2005-02-01 A. J. Lewis o Check for dups when looking at inodes for the first time o Don't mark the resource index and journal index during initialization o Details about new fsck added to FEATURES
2005-02-01 Michael Conrad Tadpol Tilstra Handful of little cleaning things I ran across hunting a bug. Mostly just removing code that was never called.
2005-02-01 A. J. Lewis o Remove a number of unused macros o Make version strings conform to the standard; update usage string Out with the old, in with the new. This is a completely rewritten filesystem checker for GFS. Performance characteristics are significantly improved. The design follows the 5-pass fsck design found in "Fsck - The UNIX File System Check Program" by McKusick & Kowalski (1994) - http://citeseer.ist.psu.edu/mckusick94fsck.html
2005-02-01 Chris Feist Updated copyright code
2005-02-01 Patrick Caulfield dlm_release_lockspace uses force==1 by default, so the LS gets freed even if there are remote locks mastered on this node (but not if there are local locks).
2005-02-01 David Teigland undo last change (enforce matching nodename/cluster.conf) so we can find a less disruptive way to do it cman_tool now reports an error if the nodename doesn't match exactly what's in cluster.conf. See bz 146320 for more info.
2005-01-31 Lon Hohberger Kill clufindhostname; its functionality isn't used, and can be provided by things like host(1) Clean up lock spaces so we can unload dlm module Use {libdir} for {slibdir} if no slibdir specified Use {libdir} for {slibdir} if no slibdir specified on configure's command line
2005-01-31 Patrick Caulfield Make heartbeat thread exit when "quit_threads" gets set. This also means moving the heartbeat timer control into that thread (quite sensibly) and also clearing wake_flags when membership thread quits. fixes bz#146327
2005-01-31 sunjd Change the header checking for stonith plugin vacm. Tested with vacm-2.0.5a CVS patchset: 4671 CVS date: 2005/01/31 08:06:32 --HG-- extra : convert_revision : 9f770a982ae88f89f0880ef5d12c11b8ff5e5d6d
2005-01-28 Michael Conrad Tadpol Tilstra pending plocks on gulm can now be interrupted.
2005-01-28 Lon Hohberger Merge from RHEL4 branch
2005-01-28 Michael Conrad Tadpol Tilstra fix for bug 146479, was not properly pushing posix range lock information into the VFS. Am now.
2005-01-28 Patrick Caulfield If we get a positive JOINACK then ignore any negative ones that come afterwards and we know there's a node that will join us. If it dies for any reason, the timer will restart. This should speed up parallel join somewhat. Don't starve processes that are filling buffers. Fixes bz#143448
2005-01-27 Ken Preslan Fix a bug in a memset(). Don't let stale/invalid data leak up to userspace on a read() from a withdrawn filesystem. Change diaper so it doesn't return errors if the filesystem is already shutdown. This makes things a lot quieter. Don't complain about uninit()ing busy log buffers if we're shutdown. Don't complain if we stall trying to make a log reservation.
2005-01-27 David Teigland There are a couple of potential problems this should fix related to an overlap of dlm_recoverd in ls_nodes_reconfig (changing the ls_nodes list) for recovery event N and dlm_recvd in dlm_dir_rebuild_send (reading ls_nodes list) for recovery event N-1. We now wait, in dlm_ls_stop(), for dlm_recoverd to detect and abort recovery event N-1. This ensures that when all nodes receive dlm_ls_start() and begin recovery event N, that no other nodes are still working on recovery event N-1 in any way. There's still the chance that a stray/delayed message (in particular a RECOVERNAMES request) pertaining to event N-1 will be delivered while all nodes are working on event N. There's an added check that should prevent this from causing any trouble for all practical purposes. One other possible problem is dlm_ls_stop clearing the status bits right before ls_nodes_reconfig sets them. Doing this after dlm_recoverd is suspended makes it safe. I don't know which, if any, of these potential problems have actually been observed; none of them are very likely. But, I think there's a fair chance that the first problem matches bz 145831.
2005-01-26 alan Made Intel's features not part of --enable-all CVS patchset: 4645 CVS date: 2005/01/26 20:01:27 --HG-- extra : convert_revision : f8550ffd32036bccbb6940a2d31c404d79ab5446
2005-01-26 Jonathan Brassow - ccs(7) not ccs(8)
2005-01-26 Patrick Caulfield Don't call nodeid2con() if we're shutting down, it might allocate a new con. There's an off-chance this might fix bz#143449 Change a BUG() into a printk, it's not really /that/ serious an anomaly. Put some more validation on integers passed in from the commandline.
2005-01-26 David Teigland ignore any wakeup that arrives after the serviced thread exits return an error in process_join_stop instead of asserting
2005-01-25 Lon Hohberger Index of error messages Merge from RHEL4 branch: man pages, error documentation + indexing Clean up negative memory-leak testing Merge from RHEL-4 branch: internal test cases, make-check target, actually install man pages Fix memory leaks as a result of not cleaning up libxml2; move test functions into test.c Internal consistency check tests for rgmanager's tree/list.
2005-01-25 David Teigland improve chances that withdraw won't get stuck by: - not trying to stop the lock_dlm threads which might be blocked permanently in gfs - abandoning the locks on the null list (like we abandon all the rest)
2005-01-24 Adam Manthei regex fix for gulm and cman check of /etc/cluster/cluster.conf (https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=146036)
- if ! grep -qE "<[\t ]*gulm[\t ]*.*>"
+ if ! grep -qE "<[[:space:]]*gulm([[:space:]]|[>]|$)"
2005-01-24 Patrick Caulfield Split the removal of a node out of STARTTRANS as they can get interrupted/overruled and all sorts of things. There is now a separate NODEDOWN message which, well you can guess what it does. The first node to spot the dead node still starts the transition as before.
2005-01-24 David Teigland Change all log_all() to either log_debug() or log_error() since log_all recently became log_debug. Fix dlm_astd hang in bz 145090. Have dlm_astd skip lkb's in the ast queue that belong to non-running/in-recovery lockspaces. Previously, dlm_astd would block on the lockspace's in_recovery semaphore ahead of the lockspace (and the semaphore) being freed.
2005-01-24 Patrick Caulfield If we get an old STARTTRANS(REMNODE) then still remove the node from our list because someone saw it die.
2005-01-21 Ken Preslan Cleanup.
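To see what the 2005-01-24 gulm regex change in the init scripts does, the two grep -qE patterns can be compared side by side. The sketch below is an illustrative Python translation, not part of the scripts themselves: the POSIX class [[:space:]] is spelled out as an explicit character set, input is matched line by line as grep would, and the sample cluster.conf fragments are invented for the comparison.

```python
import re

# Explicit equivalent of the POSIX [[:space:]] class used by grep -E.
SPACE = r"[ \t\n\r\f\v]"

# Old pattern: anything starting with "<gulm" and later containing ">".
OLD_GULM = re.compile(r"<[\t ]*gulm[\t ]*.*>")
# New pattern: "gulm" must be followed by whitespace, ">" or end of line.
NEW_GULM = re.compile(r"<" + SPACE + r"*gulm(" + SPACE + r"|[>]|$)")


def mentions_gulm(pattern, text):
    """Return True if any line of text matches, mirroring grep -q."""
    return any(pattern.search(line) for line in text.splitlines())


# Invented sample fragments, used only to compare the two patterns.
samples = {
    "plain tag": "<gulm>",
    "tag with attribute": '<gulm name="alpha">',
    "tag split over lines": '<gulm\n  name="alpha">',
    "different element": "<gulmish>",
}

for label, text in samples.items():
    print(label, mentions_gulm(OLD_GULM, text), mentions_gulm(NEW_GULM, text))
```

In this sketch the new pattern accepts an opening tag whose attributes continue on the next line and rejects an element whose name merely begins with "gulm"; the old pattern does the opposite in both cases.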
2005-01-21 Lon Hohberger Add make check, ra-api-1-modified.dtd for validating ra metadata
2005-01-21 David Teigland include time in debug output
2005-01-20 Adam Manthei o fixed broken regex o make gulm check happen on start only
2005-01-20 Lon Hohberger file nfs-tests was initially added on branch RHEL4. Include bonded-ethernet link detection
2005-01-20 Adam Manthei start gulm when --use_ccs is used only if is defined in /etc/cluster/cluster.conf (https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=145458) don't start cman if is defined in /etc/cluster/cluster.conf (https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=145453) o cleaned up some of the man pages o removed more fence_rib references o removed fence_rib and fence_racksaver from bin/Makefile Initial cut of drac fencing agent used at Oracle World. Use at your own risk.
2005-01-20 Patrick Caulfield Grr, got me patches & cvs all mixed up again. remove_joiner() now also informs the poor node that its join has been preempted and it must try again. Return an error from several ioctls if they are called when we are not part of a cluster.
2005-01-20 David Teigland improve an error message
2005-01-19 Daniel Phillips Add index buffer dirtying, block freeing to snapshot delete. More cleanups. Fix bug in excess btree level trimming.
2005-01-19 Michael Conrad Tadpol Tilstra Forgot to update gio. stop being wavery on where tags go.
2005-01-19 Lon Hohberger Fix build bug on slackware
2005-01-19 Patrick Caulfield Remove param from remove_joiner() as that part of the patch hasn't been committed yet. Sorry. Set threads to SCHED_FIFO scheduling policy.
2005-01-19 David Teigland 'fence_tool monitor' will print debug logging from fenced send all debug logging over local socket so fence_tool can monitor at any time while running
2005-01-19 Daniel Phillips BTree snapshot delete refactored and cleaned up, tree collapsing added, a few coalescing bugs fixed.
2005-01-18 Jonathan Brassow - fix the CCS archive -> xml conversion for the gulm section I was converting to the old way the servers were listed, i.e: foo instead of:
2005-01-18 Lon Hohberger amd64 fix
2005-01-18 Jonathan Brassow - better error reporting when there are multiple concurrent updates - fix bug 145393 second ccs update attempt (from another node) fails if update file is /etc/cluster/cluster.conf
2005-01-18 Ken Preslan Unbreak the user tools.
2005-01-18 Patrick Caulfield Another remove_joiner() needed - this time if the new node does not respond to JOINCONF
2005-01-18 David Teigland document new actions wait, status, nodes, services add cman_tool actions status|nodes|services to display /proc/cluster/status|nodes|services
2005-01-18 Patrick Caulfield Check for valid LKB in find_lock_by_id() rather than later on when we've already assumed it's a valid array index.
2005-01-18 Daniel Phillips BTree one-pass delete is now incremental with leaf/node coalescing Phew, what a struggle that was. NB: csnap server currently set up only for unit testing of delete code; will stay broken for a few more days.
2005-01-18 Adam Manthei Apply Derek Anderson's patch to resolve bug 144514: (fence_sanbox2: failure path locks user out of admin mode)
Index: fence_sanbox2.pl
===================================================================
RCS file: /cvs/cluster/cluster/fence/agents/sanbox2/fence_sanbox2.pl,v
retrieving revision 1.1
diff -u -p -r1.1 fence_sanbox2.pl
--- fence_sanbox2.pl 24 Aug 2004 16:05:37 -0000 1.1
+++ fence_sanbox2.pl 18 Jan 2005 00:29:06 -0000
@@ -161,6 +161,9 @@ $t->print("show port $opt_n");
 if (!((($opt_o =~ /disable/i) && ($text =~ /AdminState\ *Offline/i)) || (($opt_o =~ /enable/i) && ($text =~ /AdminState\ *Online/i))))
 {
+ # Get out of admin mode before failing
+ $t->print("admin end");
+ $t->waitfor('/\>/');
 fail "failed: could not change state to $opt_o\n";
 }
2005-01-17 Lon Hohberger Fix magma plugin names + installation paths to match LIBDIR/magma Use libdir/magma for plugins
2005-01-17 Patrick Caulfield return error if ioctl(GETNODE) called before we are a cluster member. Remove some redundant code. Don't try to double-free connection if bind() fails.
2005-01-17 David Teigland remove double \n log more info to syslog to help people see why nodes are fenced
2005-01-14 Michael Conrad Tadpol Tilstra relocated some tags to attributes. Easier for the gui to work and we're about chewy, gui goodness. grab more of the login request for services so if the protocol version mismatches we can print out the name of the server that did this. hopefully help with some debugging.
2005-01-14 David Teigland - don't ignore the completion callback for a canceled lock - extend the info logged for cancel
2005-01-14 Ken Preslan Copyright/GPL.
2005-01-14 Patrick Caulfield Add wait options to cman_tool to help with script synchronisation.
cman_tool join -w (join & wait until a cluster member)
cman_tool join -wq (join & wait until cluster is quorate)
cman_tool wait (wait until a cluster member)
cman_tool wait -q (wait until cluster is quorate)
2005-01-14 Ken Preslan Print a better message when versions mismatch.
2005-01-14 Patrick Caulfield Sanity-check the votes, so that expected_votes doesn't get silly.
2005-01-14 Ken Preslan Start using the new generic permission checking code in 2.6.10.
2005-01-14 David Teigland - Leave the lockspace when gfs does a withdraw. For now we abandon all the lock_dlm memory associated with the lockspace. - Fix a memory leak in unmount where nodes list was not freed.
2005-01-14 Ken Preslan Update patches.
2005-01-14 David Teigland fix freeing name string before printing it in syslog in startup check add a loop around the cman GETNODE ioctl since it may return an error
2005-01-13 Lon Hohberger Fix for 44945
2005-01-13 Jonathan Brassow - fix bug with update when absolute path was specified.
2005-01-13 Adam Manthei having init.d/cman load the dlm module makes lon happy o cleaned up init.d script to remove excessive `cman_tool join` and `cman_tool leave` calls. o added a config parameter to control time to wait before deciding that it can't join the cluster (CMAN_CLUSTER_TIMEOUT)
2005-01-13 Jonathan Brassow - add ccs_tool man page - warn every 30 sec (vs 10) if ccsd can not connect to cluster infrastructure. - ccs_tool now handles the update process. - allows for much better (and easier) error reporting
2005-01-13 Patrick Caulfield Add SAF AIS lock API support Contributed by Stanley Wang of Intel. Thanks (and sorry for the delay). Don't commit with debugging enabled. ! If the cluster gets down to 1 node and the last leaver left with "remove" then carry on working. bz#144309 If we get nominated as master, remove any joining node we may have. With luck, this will finally nail bz#133512
2005-01-13 Ken Preslan Some 2.6.10 stuff. Incore printing / profiling / tracing library.
Profiling/tracing stuff.
2005-01-12 Michael Conrad Tadpol Tilstra bleh, forgot man page. use lockserver instead of server. less confusing. gulm will now parse the server list from ccsd as instead of a single server line. stop gulm_tool from printing its errors to syslog. fix for bug 144909 clients report quorate=false when master goes away.
2005-01-12 Jonathan Brassow - sync with changes made to RHEL4 branch
2005-01-12 Ken Preslan Fix statfs bug.
2005-01-12 Adam Manthei on shutdown, use gulm_tool to determine node liveliness rather than nodeinfo
2005-01-11 Adam Manthei pid file is now in /var/run/cluster/ccsd.pid, not /var/run/sistina/ccsd.pid syncing w/ RHEL4 branch
diff -u -p -r1.1 Makefile
--- cluster/fence/agents/bladecenter/Makefile 25 Oct 2004 16:23:03 -0000 1.1
+++ cluster/fence/agents/bladecenter/Makefile 11 Jan 2005 22:38:27 -0000
@@ -19,7 +19,7 @@ include ${top_srcdir}/make/defines.mk
 all: $(TARGET)
-fence_apc: fence_apc.pl
+fence_bladecenter: fence_bladecenter.pl
 : > $(TARGET)
 awk "{print}(\$$1 ~ /#BEGIN_VERSION_GENERATION/){exit 0}" $(SOURCE) >> $(TARGET)
 echo "\$$FENCE_RELEASE_NAME=\"${RELEASE}\";" >> $(TARGET)
move the ordering of the scripts. lock_gulmd needs to be called before clvm2. S20 ccsd S21 cman S22 lock_gulmd S23 fenced S24 clvmd Add Magma to the list of services that are ignored on shutdown.
Without this exclusion, init.d/lock_gulmd will refuse to stop while ccsd is connected
2005-01-11 Lon Hohberger Fix bug that mantis saw; we shouldn't log in with newlines
2005-01-11 David Teigland use die() macro so prog_name is prefixed to error messages yesterday's signal changes broke "fence_tool leave" which requires we watch for SIGTERM
2005-01-11 Adam Manthei o makes usage of ccs configurable o adds parameter to GULM_OPTS for passing commandline opts to lock_gulmd o will not shutdown a server if there are clients logged into it and no other servers are running o reports if the node needs to be fenced before it can log in
2005-01-10 Michael Conrad Tadpol Tilstra print node name and ip on the protocol mismatch error message. added a way to query gulm's current running config from gulm_tool
2005-01-10 andrew Optionally suppress and pretty-print compiler output - so it's more readable Pretty printing code courtesy of Mike Gleason - NcFTP Software CVS patchset: 4579 CVS date: 2005/01/10 18:04:58 --HG-- extra : convert_revision : e334b6975c5b56897f171f0d24676f087f605672
2005-01-10 Michael Conrad Tadpol Tilstra Added a warning for if you didn't --cluster_name before you --use_ccs. Added a line to the startup banner telling which cluster we belong to. previous method of not calling out to ccsd more than once had issues to say the least. This should be a bit cleaner.
2005-01-10 Patrick Caulfield Clean the queued_messages list at shutdown. Should fix bz#143538
2005-01-10 David Teigland use sigsuspend instead of pause as the reliable way to wait for a signal http://www.gnu.org/software/libc/manual/html_node/Sigsuspend.html#Sigsuspend this should fix bz 133420 change -s to -n in syslog message Use ccs_get_list instead of ccs_get to prevent infinite looping when there's one element in the list. fixes bz 144322
2005-01-07 Jonathan Brassow IF YOU USE CCS, YOU WILL WANT TO READ THIS: There have been changes to ccs_get.
ccs_get used to operate as follows:
- A query that resulted in a single match would return that match every time.
- A query that resulted in multiple matches would iterate over the matches given subsequent calls, terminating with null, and then continuing from the beginning.
This made things very hard if you were expecting a list but there was only one match. You ended up waiting for a 'null' that never happened. ccs_get now operates as follows:
- Subsequent queries of the same request will simply iterate repetitively over the matches - never returning null.
A new function has been added, 'ccs_get_list'. It takes the same arguments as 'ccs_get', but operates as follows:
- Subsequent queries of the same request will return the matched items, followed by null, then repeat.
2005-01-07 Michael Conrad Tadpol Tilstra I ran into some weird race with having gulm load info from ccsd more than once while ccsd was trying to figure out if it was in a quorate cluster or not. Solution is not to ask ccsd for info more than once.
2005-01-07 Adam Manthei mention fence_bladecenter in man page for fence_xcat New man page now reads:
fence_rib(8) fence_rib(8)
NAME
fence_rib - I/O Fencing agent for Compaq Remote Insight Lights Out card
DESCRIPTION
fence_rib is deprecated.
fence_ilo should be used instead
SEE ALSO
fence_ilo(8)
diff -u -p -r1.3 Makefile
--- Makefile 13 Sep 2004 17:04:45 -0000 1.3
+++ Makefile 7 Jan 2005 18:45:53 -0000
@@ -22,7 +22,6 @@ all:
 cd manual && ${MAKE} all
 cd mcdata && ${MAKE} all
 cd rackswitch && ${MAKE} all
- cd rib && ${MAKE} all
 cd sanbox2 && ${MAKE} all
 cd vixel && ${MAKE} all
 cd wti && ${MAKE} all
@@ -39,7 +38,6 @@ copytobin:
 cd manual && ${MAKE} copytobin
 cd mcdata && ${MAKE} copytobin
 cd rackswitch && ${MAKE} copytobin
- cd rib && ${MAKE} copytobin
 cd sanbox2 && ${MAKE} copytobin
 cd vixel && ${MAKE} copytobin
 cd wti && ${MAKE} copytobin
@@ -56,7 +54,6 @@ clean:
 cd manual && ${MAKE} clean
 cd mcdata && ${MAKE} clean
 cd rackswitch && ${MAKE} clean
- cd rib && ${MAKE} clean
 cd sanbox2 && ${MAKE} clean
 cd vixel && ${MAKE} clean
 cd wti && ${MAKE} clean
fence_rib is deprecated by fence_ilo
diff -u -p -r1.1 fence_brocade.pl
--- fence_brocade.pl 24 Jun 2004 08:53:13 -0000 1.1
+++ fence_brocade.pl 7 Jan 2005 17:55:40 -0000
@@ -43,7 +43,7 @@ sub usage
 print " -a IP address or hostname of switch\n";
 print " -h usage\n";
 print " -l Login name\n";
- print " -n Port number to disable\n";
+ print " -n Port number to operate on\n";
 print " -o Action: disable (default) or enable\n";
 print " -p Password for login\n";
 print " -q quiet mode\n";
2005-01-07 David Teigland Log the name of the node fencing is deferred to or say "prior member" if the name isn't known. fixes bz 144170
2005-01-07 Patrick Caulfield Make all nodes print a message saying why another node left the cluster, so that tracing a fence is a little easier. bz#144386
2005-01-06 Patrick Caulfield If bind() fails then NULL con->sock too, so the tidy up doesn't cause an oops. fix for bz#144144 Move find_minor_from_proc before its first use. Don't loop ourself to death if we run out of memory.
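The difference between the new ccs_get and ccs_get_list iteration rules from the 2005-01-07 announcement can be modelled in a few lines. This is a toy Python model of the described behaviour only, not the libccs C API: QueryCursor and its method names are hypothetical, None stands in for the C null return, and returning None from get() when there are no matches at all is an assumption the announcement does not cover.

```python
class QueryCursor:
    """Toy model of repeating one query against a fixed list of matches."""

    def __init__(self, matches):
        self.matches = list(matches)
        self.pos = 0

    def get(self):
        """Model of the new ccs_get: cycle over the matches, never None."""
        if not self.matches:
            # Assumption: with no matches there is nothing to cycle over.
            return None
        value = self.matches[self.pos % len(self.matches)]
        self.pos += 1
        return value

    def get_list(self):
        """Model of ccs_get_list: each match, then None, then repeat."""
        if self.pos == len(self.matches):
            # End of the list: report the terminator and start over.
            self.pos = 0
            return None
        value = self.matches[self.pos]
        self.pos += 1
        return value


# A caller that loops until the terminator now finishes even for one match.
cursor = QueryCursor(["node1"])
items = []
while (item := cursor.get_list()) is not None:
    items.append(item)
print(items)
```

The loop at the end is the pattern that used to hang with a single match under the old ccs_get, which is the infinite-loop case the 2005-01-10 fenced fix (bz 144322) refers to.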
2005-01-06 andrew New crm regression test CVS patchset: 4557 CVS date: 2005/01/06 09:35:37 --HG-- extra : convert_revision : bd1bc87f4029a4817570cfedec5b053cfd63eb6e
2005-01-06 Jonathan Brassow - fix for bug #137021 ccs doesn't find most recent cluster.conf This boiled down to a timing issue. ccsd processes requests serially (broadcast, or otherwise). When cman_tool was started, it issued a connect. While not yet quorate, ccsd must broadcast to see if there are any more recent versions of the config file. This happens as part of the connect. select is used to set a timeout on just how long it waits for replies. 2 problems were encountered. 1) the timeout was not being properly reset after the select returned. 2) everyone uses the same timeout What results is that if a connect is issued simultaneously on every node, they first try to process the connect - then any broadcast requests. Because they all have the same timeout, they never receive broadcast responses from their peers (because they are also stuck processing connects). The current solution is to add a random component to the timeout and make sure to set the timeout properly after the select returns.
2005-01-05 Jonathan Brassow - forward port bug fix 133254
2005-01-05 Benjamin Marzinski Fixed gnbd_export remove error message for bug 143131. Now it will print the correct errno. Note, this doesn't fix the bug.
2005-01-05 Michael Conrad Tadpol Tilstra Wasn't calling ccs_disconnect() when reading config from ccs. Do that now. Fixes rbz144286
2005-01-05 Ken Preslan Reworked gfs_printf(). Add new file. Update patch.
2005-01-05 David Teigland Release rsb's semaphore before queue_ast() since queue_ast() can free the rsb at any time. Patch from Daniel McNeil
2005-01-05 Ken Preslan Get rid of -P option. It should be replaced with CLVM labeling later.
2005-01-05 Jonathan Brassow - if a lockfile can not be created, print that ccsd is already running.
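The ccsd timing fix from 2005-01-06 (bug #137021) combines two ideas: set the select timeout afresh before every wait, and add a random component so that nodes started together do not all time out in lockstep. The sketch below illustrates that pattern in Python; it is not the ccsd code, and the function name, default values and the use of a plain socket are assumptions made for the example.

```python
import random
import select
import socket


def wait_for_replies(sock, base_timeout=1.0, jitter=0.5, rounds=3, rng=random):
    """Collect replies, using a fresh jittered timeout for every wait."""
    replies = []
    for _round in range(rounds):
        # Build the timeout again on each pass. In C on Linux, select() may
        # modify the timeval it is given, so reusing one across calls can
        # silently shorten later waits; recomputing avoids that.
        timeout = base_timeout + rng.uniform(0.0, jitter)
        readable, _w, _x = select.select([sock], [], [], timeout)
        if not readable:
            # Timed out this round; try again with a new random timeout.
            continue
        replies.append(sock.recv(4096))
    return replies


# Small self-check using a local socket pair in place of real broadcasts.
left, right = socket.socketpair()
right.send(b"config-version-2")
print(wait_for_replies(left, base_timeout=0.01, jitter=0.01, rounds=2))
left.close()
right.close()
```

The jitter range is a tuning choice: it only needs to be wide enough that peers stop waiting at different moments and get a chance to answer each other's broadcasts.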
2005-01-04 Jonathan Brassow - some cmirror changes that have been sitting around. - report better errors (especially in the event that magma plugins aren't found)
2005-01-04 Lon Hohberger Report a usable reason when clu_connect fails. Kill a bunch of assertions. Return EINVAL when we try to access functions from a nonexistent plugin Put in decent errno values for cp_connect so callers can find out what happened when it failed.
2005-01-04 Patrick Caulfield NULL some pointers when we shutdown. Use sock_create_kern() rather than sock_alloc() for 2.6.10
2005-01-04 Ken Preslan o Changed GFS ioctls so purely incore binary structures aren't used. This should allow mixing of different versions of gfs userspace and kernelspace without too much trouble. o Update the user tools that use those ioctls. o Cleaned up gfs_tool.
2005-01-03 Ken Preslan Dave's changes to allow "gfs_tool [freeze|unfreeze|withdraw]" to work on CLVM devices.
2005-01-03 Lon Hohberger Fix segfault if rgmanager wasn't running
2004-12-23 Michael Conrad Tadpol Tilstra So here is a bit of code that finishes up the withdraw support for gulm. It abuses the way callbacks on locks work to give something that can alert other mounters that a journal needs replaying. It works. I don't like it. It feels kludgy. I am going to see if I can come up with something better, but this at least will let people play with withdraw in a cluster. moved a message from Always to locking.
2004-12-23 Jonathan Brassow - add ccs_tool to Makefile This will likely switch from a script to a C-program soon, as functionality is added.
2004-12-23 Adam Manthei Jon changed the verbosity of ccsd. -v doesn't need to be on by default anymore.
2004-12-23 Jonathan Brassow - remove log_msg_always - make log_msg always print - remove -v option I was making the mistake of thinking that the user should control the daemon's output by using (or not using) -v.
However, the more correct way is to print everything to the log and let the user sort through it by defining syslog preferences.
2004-12-22 Adam Manthei make the daemon more verbose in the script since the default logging is not very helpful.
2004-12-22 Michael Conrad Tadpol Tilstra updated the patch for last lock_gulm.ko commit. Most of the withdraw code. Still lacking a bit to notify other mounters that the withdrawn journal needs replaying. But at least people can start playing with the withdraw and not assert. code to allow expiration of a subset of locks. This will be used by the withdraw code.
2004-12-22 Patrick Caulfield Make MAX_RETRIES /proc settable. Make some KILL messages reliable. Don't hold the res_lock for quite so long. Well, it works for Daniel McNeil.
2004-12-22 Ken Preslan Fixed a reference leak of lock module in-use counts that was caused by filesystems withdrawing. Fix a case where the resource index isn't properly released on error.
2004-12-21 Jonathan Brassow - fix bug 128662, ccs_test connect : connects to local ccsd regardless There was a remaining end case with this bug. A machine could broadcast, forcing all other nodes to load their config files. So, if you wanted a different config file on another machine (besides what was local at the time of the broadcast), you would have to kill and restart ccsd. This is no longer necessary. - fix bug: 128422 (ccsd grabs network config silently if local is bad) Now, if the local config file is bad, ccsd errors out - giving the user the output from the parser as well as the steps to take to fix the problem.
2004-12-21 Adam Manthei remove /dev/misc/dlm-control if the major/minor number of the device node conflict with the kernel assignments as reported by /proc/misc. This resolves bug #138491
2004-12-21 Patrick Caulfield Be a lot quieter on the console If a transition gets usurped by another node, always tidy up old joining nodes.
Also, prevent a CHECK transition from usurping a real one as that's just a waste of everyone's time.
2004-12-20 Chris Feist Added install of init script in top level Makefile. Fixed a problem which wouldn't uninstall the initscript. Updated toplevel Makefile to install init script. Added install/uninstall to main Makefile for init script. Added option to install init scripts.
2004-12-20 sunjd update for newly ported OCF RAs CVS patchset: 4527 CVS date: 2004/12/20 16:21:06 --HG-- extra : convert_revision : 235c7964ffbce522b50ef57d22225ef9f3f72c72 add newly ported OCF RAs CVS patchset: 4526 CVS date: 2004/12/20 16:19:37 --HG-- extra : convert_revision : c6ff09d88b8bb4d5625ae47b0df8fd4753919100 add exporting for 'info' msg CVS patchset: 4525 CVS date: 2004/12/20 16:17:19 --HG-- extra : convert_revision : 84f50a09ba4867c941392b82ed21ac15712ece50 add newly ported OCF RAs CVS patchset: 4524 CVS date: 2004/12/20 16:14:35 --HG-- extra : convert_revision : 25935226990f0d71067a91e2f7ac22cc57c74c22
2004-12-20 Jonathan Brassow - minor updates to ccs_tool
2004-12-18 Ken Preslan Add code to implement "gfs_tool withdraw /mountpoint". o Add facility to stop all block I/O from a filesystem by setting a bit. The number of outstanding I/O is also kept track of, so it's possible to wait until all in-flight I/O has finished. o Used this new code to allow GFS to abort a filesystem at arbitrary points in time. GFS sets the bits to stop I/O, waits for all in-flight I/O to stop, and then calls a new lock module function, lm_withdraw, to ask the lock module to leave the lockspace. (Basically, perform all recovery steps for the filesystem except fencing.) o Switch the places that see unrecoverable I/O errors or filesystem consistency problems to withdraw from the cluster instead of panicking. o Get rid of most of the assert calls that panic the system in favor of printing warnings or withdrawing from the cluster. There is still some ongoing work that needs to be done.
Mostly in the area of making sure that resources are freed appropriately in error cases that (now) don't panic the machine. o At the moment, it's usually possible to cleanly unmount the filesystem after it has withdrawn itself from the cluster. There are still some cases where the filesystem is unmountable or the withdraw causes oopses. These are bugs that can and will be fixed. o The lm_withdraw() functions for lock_gulm and lock_dlm still need to be written. Right now they just panic the machine. :-) 2004-12-17 Jonathan Brassow - fix a problem where if the working path was set (via ccs_set_state()), lookups could fail. 2004-12-17 Adam Manthei initial stab at init.d script 2004-12-17 Jonathan Brassow - fix bug 143165, 134604, and 133254 - update related issues These all seem to be related to the same issue, that is, remote nodes were erroneously processing an update as though they were the originator - taking on some tasks that didn't belong to them. This was causing connect failures, version rollbacks, etc. - fix a problem with ccs_tool update which was causing a connection to be left open. 2004-12-17 Michael Conrad Tadpol Tilstra - lock tables are supposed to register themselves are LT%03d, not %03d 2004-12-16 Jonathan Brassow - add log_msg_always macro, which prints a msg of priority NOTICE regardless of whether -v was specified. - Try to give reasons for update failures. - replace instances of sistina with "cluster" (e.g. /var/run/sistina -> /var/run/cluster) - add beginning of update/upgrade tool 2004-12-16 Michael Conrad Tadpol Tilstra - SIGTERM is ignored. Update man page describing this. 2004-12-16 Patrick Caulfield Fix a few join related bugs. Many of which are related to #142853 and # 1335212. I'm not fully convinced that these are all sorted yet though. Also, when we get a "leave remove" message, try to reduce quorum far enough so that activity continues. 2004-12-15 Lon Hohberger Make script actually work. 
Fire wall ports are corrected, but the iptables commands need revisiting. Unbreak init script Display all cluster members if we're not part of the RG manager group. src/daemons/Makefile: Remove debugging compiler flag - include/resgroup.h: Remove unnecessary states/requests. Add handle_start_remote_req, rg_doall, and svc_status protos. Add FORWARD define for when a request can be carried out - just not locally. - include/reslist.h: Add protos for res_condstop/res_condstart and resource_delta/resource_tree_delta functions - src/clulib/rg_string.c: Change to match states/requests in resgroup.h - src/daemons/groups.c: Don't even bother evaluating disabled resource groups. Handle CONDSTOP and CONDSTART requests in the script exec path for changed resources. Handle online resource and group configuration changes (e.g. after a SIGHUP). Add rg_doall for queueing an event on all resource groups and optionally waiting for completion. - src/daemons/main.c: Handle SIGHUP - flag for reconfiguration. Move status checks queueing in to the main loop (used to happen automatically from within rg_threads -- this caused problems with reconfiguration). - src/daemons/restree.c: Fix misc bug preventing resources without a recover function being treated as though they had a recover function. - src/daemons/rg_locks.c: Rename __ccs_mutex to _ccs_mutex so as not to upset glibc developers. - src/daemons/rg_state.c: Add svc_advise_stop to clean up _svc_stop function a bit. Misc cleanups in the start/stop functions. Move report of started RG into svc_start so it doesn't appear multiple times in different places (causing appearance of a RG being started twice on one node...). Add svc_status function as part of rg_thread cleanup. Return FORWARD from svc_stop/etc when we don't own the resource group. - src/daemons/rg_thread.c: Change thread model to have worker threads exit for a given RG instead of hanging around. 
This greatly simplifies the code paths and reduces memory consumption when running mlocked. - src/utils/clustat.c: Change reporting of "no members" to "this node". Just because one node hasn't joined a SG doesn't mean others haven't. 2004-12-15 Patrick Caulfield Don't use a large(ish) static buffer for the membership state. Might fix bz#142865 2004-12-14 Ken Preslan More comments from ben.m.cahill@intel.com. 2004-12-14 Benjamin Marzinski oops. switched long variable to unsigned long so that the gnbd device size doesn't get screwed up on large devices. fixes rbz 142870 2004-12-13 Lon Hohberger Fix up circleping to work properly. 2004-12-13 David Teigland When rebuilding an lkb on a new master, the lockqueue_flags value was being skipped. This could cause many problems in just the right circumstances, although no actual bug reports are an obvious case of this. 2004-12-11 David Teigland add sections on setting panic_on_oops and sources of more info document the optional nodeid setting 2004-12-11 Jonathan Brassow - Add message (that prints when -v flag is used with ccsd) that reports when the cluster is quorate and will allow connections. - Change "cluster not quorate. refusing connection." back to log_msg from log_err. This is done since fenced will now often query ccs before the cluster is quorate. It is not an error, it's just waiting for the cluster to become quorate... - Fix a problem Dave was seeing. If you do a ccs_test connect, before starting ccsd, it will appear to stall indefinitely while doing retries. The function in the ccs library responsible for talking to ccsd must set up a connection from a privileged port. Sometimes, the first port it tries to bind to may be in use, so it moves on to the next. This is necessary. However, I was not properly checking the error return codes. So, an ECONNREFUSED was being interpreted as "Hey, try another port in a little bit."
Only, after erroneously trying all the privileged ports does it error out - which is why it appears to stall indefinitely. 2004-12-10 Ken Preslan TASK_INTERRUPTIBLE -> TASK_UNINTERRUPTIBLE. Munge. Get rid of the kernel patch. Not a battle worth fighting. Sigh. 2004-12-10 Lon Hohberger Fix relocate on more-preferred-member-boot; fix minor bugs in building alloc.c 2004-12-10 David Teigland - slight correction in description of expected_votes - leave votes="1" out of standard node descriptions - add a section on specifying node votes A node gets 1 vote by default when no votes value is specified. Remove votes="1" in the examples since it's unnecessary and we don't specify any other values when using the default. add -D debugging option add debugging output for -D and a check to see if ccs connect works 2004-12-10 zhenh add OCF IPv6addr CVS patchset: 4480 CVS date: 2004/12/10 06:04:10 --HG-- extra : convert_revision : 20d98f02871b50ede755119ccba4c3b51c8f8912 the OCF version IPv6addr CVS patchset: 4479 CVS date: 2004/12/10 06:03:33 --HG-- extra : convert_revision : 644b329c76da8b668a04f93133b411d1200f2e85 2004-12-09 andrew New test to generate CVS patchset: 4473 CVS date: 2004/12/09 14:54:39 --HG-- extra : convert_revision : db24206fefce80d170734d2df46b2ab755afed17 2004-12-09 Patrick Caulfield lkb_dequeue s/b res_lkb_dequeue 2004-12-09 David Teigland prefix output with prog name 2004-12-08 Daniel Phillips Whoops. - Fixed struct client pointer stability bug in server too - Merged Ben's lock handling functions Fix struct client pointer stability bug in server too 2004-12-08 David Teigland update from src files update from src files use dlm alt modes for LM_FLAG_ANY Two new flags that can be used for gfs's ANY flag: ALTPR and ALTCW. ALTXX means a lock is requested in XX mode if the ordinary request mode cannot be granted. If neither can be granted, the ordinary mode applies. The ALTMODE flag is returned if the alternate mode was granted.
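The alternate-mode rule described in the entry above can be illustrated with a small model. This is only a sketch, not the dlm implementation: the names COMPAT, grantable and resolve_request are hypothetical, the compatibility table is the conventional six-mode DLM matrix (NL, CR, CW, PR, PW, EX), and "held" stands for the modes currently granted to other holders of the same resource.

```python
# Illustrative model only; none of these names come from the dlm sources.
# COMPAT[m] is the set of modes that may already be held by others
# while a lock in mode m is granted (conventional DLM matrix).
COMPAT = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}


def grantable(mode, held):
    """True if 'mode' is compatible with every mode in 'held'."""
    return all(h in COMPAT[mode] for h in held)


def resolve_request(mode, held, alt=None):
    """Return (effective_mode, granted, altmode_flag).

    Try the ordinary mode first. If it cannot be granted and an
    alternate mode was given (e.g. "PR" for ALTPR, "CW" for ALTCW),
    try that. If neither can be granted, the ordinary mode applies
    and the request is reported as not granted.
    """
    if grantable(mode, held):
        return mode, True, False
    if alt is not None and grantable(alt, held):
        return alt, True, True  # models the ALTMODE flag being returned
    return mode, False, False


if __name__ == "__main__":
    # PR requested with CW as the alternate while another node holds CW.
    print(resolve_request("PR", ["CW"], alt="CW"))
```

In this model a caller checks the third element of the result to learn that the lock was granted in the alternate mode rather than the one it asked for.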
2004-12-08 Daniel Phillips Turn off tracing, remove server crash simulation - bug fix: remove premature optimization that breaks when blocksize < chunksize - Add client in-flight retry on reconnect or failover - fix synchronizing bugs in kernel failover - Add a level of indirection to client vector so array compaction on client disconnection works properly (Ben had a different fix for this) - agent: don't retry connect if server connection fails, wait for new server 2004-12-07 David Teigland slight correction on how leave remove works 2004-12-06 Ken Preslan Got rid of some unused LM flags. Fix bug #141821. KNFSD could call into GFS to do a lookup on ".." of the root directory. GFS wasn't handling this correctly. 2004-12-06 David Teigland minor update don't leave if gfs is mounted Re-order some initialization so that ccs errors will be caught and reported before fenced forks into the background. 2004-12-05 andrew Allow the CRM library(s) to be compiled CVS patchset: 4433 CVS date: 2004/12/05 19:28:52 --HG-- extra : convert_revision : 29ba921ffcf9e27540777d3801252a18786339d6 2004-12-05 sunjd update for the new node fencing subsystem CVS patchset: 4426 CVS date: 2004/12/05 17:28:30 --HG-- extra : convert_revision : be839d1d90e82e32f417f17bf4cf5aac40d8df7a 2004-12-03 Ken Preslan More excellent comments from ben.m.cahill@intel.com. 2004-12-03 Patrick Caulfield Don't send a KILL message if a node has the wrong generation number, try to work it out amicably with a CHECK transition instead. Add send/recv API to libcman, and some comments to libcman.h 2004-12-03 David Teigland clean up join_ccs code 2004-12-03 Ken Preslan D'oh. Missed some files. o Add "oopses_ok" mount option. This allows GFS to oops instead of panic on assertion failures. This will hopefully allow easier debugging in some situations. But, using this option will make it possible for a GFS oops on one machine to forever stall the filesystem on all other machines in the cluster. Use with *extreme* care.
o Start working utility functions to allow better error handling. 2004-12-02 Benjamin Marzinski added the stack overflow fix to head (rbz139863) gfs_glock_nq_m(), nq_m_sync(), and gfs_glock_nq_m_atime() no longer have arrays on the stack. 2004-12-02 Ken Preslan Declare the RO array in gfs_sort() statically instead of dynamically on the stack. :-( Fix a spot where we weren't propagating away errors. Update. 2004-12-02 Patrick Caulfield Make sure transitionreason gets set if inherit mastership from a dead master node. 2004-12-02 Michael Conrad Tadpol Tilstra add servicelist to man spell generation correctly in prints. add servicelist to usage() reorder lock stats. 2004-12-02 David Teigland - update cluster startup tips, including new fenced delay/options - include cluster.conf update procedure from a linux-cluster mail add fence_tool Add an msleep(500) to dlm_recoverd in an attempt to avoid the same invalid wakeup we've seen before. add fence_tool man page A bunch of improvements: - Wait for the cluster to be quorate before starting or killing fenced. Waiting here lets the join/leave be cancelled easily (not the case if it blocks in SM). - Unfence ourselves before joining the fence domain. Can be skipped with -S. - Accept the c/j/f fenced options and pass them on to fenced when we exec it (since fence_tool is the usual way of starting fenced.) - Get rid of -t timeout option which isn't useful now that we have the c/j/f options for fenced. ignore the S option that fence_tool uses and may be inherited with the other args 2004-12-02 Patrick Caulfield Clear joining_node after a client-end transition 2004-12-01 David Teigland only do lvb recovery when a node is removed from the lockspace 2004-12-01 Patrick Caulfield Tidy up the node_id assigning code. With the changes made a while ago, a lot of this has become redundant and has been replaced with something much smaller and tidier. With luck it'll work too.
2004-12-01 David Teigland lvb sequence number should be incremented by dlm_unlock 2004-12-01 Daniel Phillips Turn off tracing output, remove extraneous debug output - Csnap server now has journalling and recovery, it seems to work - The buffer layer is now a separate compilation unit, Alasdair should be happy. - Kernel client has failover recovery and lock upload support - Miscellaneous bugfixes 2004-12-01 David Teigland DEBUG2 should be off by default Use sequence numbers to restore the most recent lvb copy during recovery when none of the locks have a definitive copy. zero the lvb when it's invalid since the dlm will not necessarily do that any more new cluster.conf tag names another tag name change 2004-11-30 Jonathan Brassow - updated fence_devices/device, but forgot nodes/node - updates to reflect change in cluster.conf tags - updates to reflect tag name changes in cluster.conf - update docs to reflect tag name changes in cluster.conf - updates for new cluster.conf tags - update example cluster.conf to reflect changes in tag names 2004-11-30 Chris Feist Added fence_bladecenter.8 & fenced.8 to the install script for fence manpages. 2004-11-30 Lon Hohberger Reflect jparsons's changes to cluster.conf structure 2004-11-30 Patrick Caulfield header->flags needs to be byteswapped since I made it an int. 2004-11-30 David Teigland begin ccs requests with /cluster instead of //cluster ccs request began //nodes/ instead of //cluster/nodes/ Both work but the later is consistent with the other uses. missed clearing VALNOTVALID flag in unlock case Add -u option to unfence the node. Does nothing if unfencing isn't supported by the agent. - add code to support unfence option - change default post-join-delay to 6 sec more helpful error message 2004-11-25 Jonathan Brassow - remove some print statements and refine todo list 2004-11-24 David Teigland don't include old license dir munge usage wording Fix some problems in the recently added lvb and valnotvalid flag recovery. 
add content Update per recent changes and new preferred option: -n While waiting for a manual ack, also check to see if the node has rejoined the cman cluster -- if it has, take that as an ack. This new check is a no-op when using gulm. nodename="hostname" is now the preferred node-specific manual fencing data instead of ipaddr="hostname". (The ipaddr form is still recognized.) Similarly, fence_manual -n and fence_ack_manual -n . (-s is still recognized for fence_ack_manual) (There has been some discussion about a possible new name for this agent, although fence_manual would remain an alias for any new name.) 2004-11-23 Ken Preslan More comments from ben.m.cahill@intel.com. 2004-11-23 Patrick Caulfield Tidy the language, fix a few typos, and update it a little. 2004-11-23 David Teigland remove warning and a couple unnecessary type casts 2004-11-22 Jonathan Brassow - make the "in_sync" flag log based (rather than module-based) this way, different mirror sets can have their own state of in-sync or not. - fix bug that allowed different nodes to have different views of who the master is. 2004-11-22 Ken Preslan When updating the atime, don't demote the glock back to shared unless someone else is asking for it. 2004-11-22 Patrick Caulfield Make command-line options override CCS rather than the other way round. That really annoyed me. Remove some redundant code. Tweaks to init script 2004-11-22 David Teigland The VALNOTVALID flag was being incorrectly cleared during recovery in cases where it was already set prior to recovery. The lvb wasn't being copied into the master-copy lkb on unlocks, so when the unlock entailed an lvb write, an old value was used. This solves the problem of gfs's cached stat values being wrong. 2004-11-20 Jonathan Brassow - a little clean-up - basic optimization (test) 2004-11-19 Lon Hohberger Re-enable support for specifying target nodes for enable/disable/etc. 
- Read configuration data before joining the service group - Forward disable requests to the correct member 2004-11-19 Ken Preslan Get rid of unneeded return value. 2004-11-19 Lon Hohberger Add checks to make sure the nfs daemons are running. 2004-11-19 Ken Preslan o Clean up some memory allocation code o Munge comments Clean up the statfs code some. 2004-11-19 Patrick Caulfield If a node dies after being sent a JOINCONF then remove it from all nodes in the cluster, not just us. Might fix 133512 Clean joining_nodeid in a few places. Also mark a joining nodeid dead before cancelling a transition. Should (probably) fix 139958 2004-11-19 David Teigland Allow delays of -1 to indicate forever. Three config options for fence daemon can be set in cluster.conf or on command line. The "delay" option added earlier is replaced by "post_join_delay" and the default is 3s. It's the length of time we would wait after adding a new node before fencing any victims (if there are any.) This helps to avoid spurious fencing on cluster startup. "post_fail_delay" is the number of seconds we will wait before fencing a failed node. This provides a chance for it to rejoin the cluster and avoid being fenced. The default is 0s which is equivalent to past behavior. "clean_start" if 1 indicates that all nodes should be assumed in a clean state to start. Any nodes in an unknown state when the fence domain is formed and enabled will not be fenced. This can be dangerous to enable. The default is 0. These can optionally be added to cluster.conf in the section 2004-11-19 Ken Preslan Many more good comments from ben.m.cahill@intel.com. 2004-11-18 Michael Conrad Tadpol Tilstra switched = to += so I can add paths to LIBS if a plock IS_SETLKW, then it is a *blocking* request. 2004-11-18 Patrick Caulfield Change dead_node_lock to a spinlock and don't hold it for nearly as long. Don't cancel join if we get a WAIT after an ACK.
This speeds up bulk rejoins quite a lot and also gets rid of some scary messages. Add a BUG() and printks that may help in tracking down 133512 2004-11-17 Chris Feist Fixed problems w/ make uninstall not working w/ rgmanager. Uninstall script didn't uninstall gnbd.h properly. 2004-11-17 Michael Conrad Tadpol Tilstra The compiler is generally better at knowing when to try to inline functions than I am. And it gets confused when I am wrong. 2004-11-17 Ken Preslan Missed a use of the async flag that was just removed. o Fix things so a freeze call requesting the transaction lock can't cancel a recovery process requesting the transaction lock. o All our lock modules are now asynchronous. Get rid of support for sync modules. Get rid of unneeded file. 2004-11-17 Patrick Caulfield Fix some comments Add a compatibility layer (conditionally compiled in) for using a 32bit libdlm on systems with a 64bit kernel. 2004-11-17 David Teigland Add more intelligence to fence daemon to avoid fencing nodes when we detect conditions where it might be likely. When fence victims are added following a node /joining/ the fence domain (rather than the usual of victims added following a node failing), it's a good clue that other nodes may be in the process of joining the cluster and therefore unfairly marked for fencing. When this is detected, delaying for a configurable number of seconds before doing the actual fence operation can allow victim nodes some time to join the cluster and avoid being fenced. The default delay in this event is now at 3s, but increasing it to 10s may be better. It can optionally be set in cluster.conf by adding: 2004-11-17 lars Large file support enabled (logfiles >=2GB on 32bit archs). 
CVS patchset: 4345 CVS date: 2004/11/17 10:11:37 --HG-- extra : convert_revision : 693b4a74fa46131795901655635b342242076cd7 2004-11-17 Jonathan Brassow - add cmirror target - fix rgmanager referencing magma-plugins/make/release.mk.input for the tarballs target - forgot the uninstall script - cluster mirror - fix seg fault that occurs when querying ccs with desc that is too large rbz134608 2004-11-16 Ken Preslan o Fix bug #133368. Change the code to trigger calls to lm_cancel() on LM_FLAG_PRIORITY requests instead of LM_FLAG_NOEXP requests. o Don't call lm_cancel() as often. 2004-11-16 Lon Hohberger - Add support for "-o" option to control on/off/reboot. - Add support for RPC/TPS/IPS series switches from WTI. Use -n instead of -p for port number. Ignore case for options read from stdin. Actually handle -v argument. Add support for the widely-used RPS-10M-HD modules. (2 node clusters ONLY) 2004-11-16 Ken Preslan Formatting munge. 2004-11-16 Patrick Caulfield Fix broadcast. Fix debug print Undo last check-in for this file that came from a bogus tree. wrists have been slapped. 2004-11-15 Patrick Caulfield Make the API a little more consistent. nodeid==0 always means (this node) and GETALLMEMBERS can be called witha NULL parameter to get the number of nodes. use GET_ALL_MEMBERS to return number of nodes in the cluster as it's consistent with the get_members call. 2004-11-15 David Teigland We should wait for the fence domain join to complete before allowing a gfs mount, and not allow a mount if the node is in the domain but leaving. Bug in names_equal() caused nodes to never be identified as "first victims". Now that this is fixed, people may start noticing their nodes being fenced when they're not in the cluster and a new fence domain is started. 2004-11-12 David Teigland fix can_avert_fence() function, was using cman api incorrectly 2004-11-12 Patrick Caulfield Udev script (goes in /etc/udev/rules.d) for creating DLM devices. 
2004-11-11 Lon Hohberger Clean up bad mallocdbg stuff 2004-11-11 Patrick Caulfield Make the default lockspace AUTOFREE so that it gets deleted when the last user closes it. Add a flag to userland lockspaces that will cause them to be deleted when the last user closes the device. Get rid of spurious ASSERT in dlm_unlock that broke cancellations. Remove locking from a routine that is only ever called from ASSERT, it only causes the ASSERT to deadlock without producing any debug info :) Decrement the module count if returning -EEXIST from dlm_new_lockspace(). 2004-11-10 Patrick Caulfield Fix refcounting error. Return the LVB "INVALID" state to users when the LVB has been invalidated. Byteswap (if appropriate) new rl_flags field. When comparing node states, "JOINING" is effectively the same as "DEAD", so don't screw up a cluster transition for that small semantic difference. Should close #133512 2004-11-10 David Teigland add a log_debug line freeing a value before using in debug print add an extra line to syslog to help track what's happening 2004-11-09 Chris Feist Fixed name of tarball to install. Created a .PHONY tag for srpms. No error if srpms already exists. 2004-11-09 Patrick Caulfield Don't try to reconnect when we get EOF on a socket. 2004-11-09 Ken Preslan Add some locking to the recent list and forward pointer when tearing down the RG list. It's not strictly necessary, since there will be no one allocating during the tear-down. But... Fix my stupidity. o Fix bug #135684. Change code to make sure a RG is part of the "recent" list before removing it from that list. Thanks to alexander.laamanen@tecnomen.com o Other allocator cleanups. 2004-11-09 Chris Feist Updated location of srpms. 2004-11-08 gshi implemented logging daemon The logging daemon is to double-buffer log messages to protect us from blocking writes to syslog / logfiles.
CVS patchset: 4268 CVS date: 2004/11/08 20:48:36 --HG-- extra : convert_revision : c1b41347cfd1f232dbb0ae8881fdb5b8465937ee 2004-11-08 Patrick Caulfield Remember to assign parent during RSB rebuild. Tidied up the userland/kernel interface so that all the data transfer happens in the read/write path rather having some copy_*_user side-effects. Also tidy the dlm_*_wait() calls so they share code with the normal calls. This breaks kernel/userland compatibilty. You must upgrade both kernel and userland (libdlm) together! Move lowcomms_close() outside of the spinlock, as it may want to sleep. 2004-11-08 David Teigland these changes should fix the rare problem of waking a recoverd thread that's already exited 2004-11-05 Patrick Caulfield Only overwrite the user's LVB if the lockop has changed it 2004-11-04 Michael Conrad Tadpol Tilstra Fix lock_gulm.ko for nonSMP kernels. Silly me, spinlock_t has no size when nonSMP. Tried to malloc an array of zero length. Don't do that any more. Thanks to Graham Wood for helping me hunt this one down. 2004-11-04 Chris Feist Moved specfiles to proper name. Moved GFS-kernel.spec.in to gfs-kernel.spec.in Fixed directory locations Added spec file. Fixed dlm.spec.in Added spec file for dlm-kernel Added spec file for fence 2004-11-04 David Teigland Set the VALBLK flag on the NL locks used for gfs's hold_lvb. Not mixing lvb/non-lvb locks makes the recovery effects more sensible although no different in the end. Allocate an lvb for a new master rsb during recovery if any VALBLK locks exist, even NL. This is a minor incremental change in lvb recovery. 2004-11-04 Patrick Caulfield lowcomms_close can be called when atomic, so we can't use nodeid2con (which uses a semaphore). Get the connection directly instead. This should be safe because lowcomms_close is called during early recovery. Keep a local copy of cbinfo->isoob as cbinfo can be freed before reaching the while -- causing an oops. 2004-11-03 Chris Feist Changed srpm building location. 
2004-11-03 Lon Hohberger Fix changes WRT bash 3. Use the 'ip' command instead of ifcfg for now. Don't arping if ipv6. No longer needed. Don't free old block if realloc fails 2004-11-03 David Teigland update from src files remove 2.6.8.1 patches missed a couple spots when removing rcom stats 2004-11-02 Lon Hohberger Clear out references to mallocdbg. Remove mallocdbg files. Fully clean up memory before exiting. Make sure we free *everything* we allocate. Add fixed-size alloc/free file. Remove mallocdbg references. Remove mallocdbg files + refs. Give ourselves ways to free/clear out all internal state (msg_shutdown(), clist_purgeall(), etc.). 2004-11-02 Michael Conrad Tadpol Tilstra - get lock_gulm.patch uptodate with source. 2004-11-02 Patrick Caulfield Add some locking around queue traversal. 2004-11-02 David Teigland Add a semaphore to serialize recovery with mount-group portion of unmount. This avoids a potential hang during unmount where both the recovery and unmount threads believe the other will call kcl_start_done() to complete the recovery. Unlikely this has been seen in practice, but possible. remove rcom debugging code that we're not using allow max nodes to be set from /proc/cluster/lock_dlm/max_nodes 2004-11-02 Chris Feist Fixed variable in rpm spec file. Added rpm spec file. Fixes in the script to build srpms. Fixes in the misc. spec files. 2004-11-01 Chris Feist Added rpm spec file for gnbd Fixes to the scripts for building SRPMS. RPM spec file for magma. Added a script for building srpms from the individual components. 2004-11-01 Michael Conrad Tadpol Tilstra By default, gulm determins the name of the machine it is on with gethostbyname(). Then it looks up an ip from this name to use as its IP. There are now three ways to override these defaults. --name will use the name given instead of what is in gethostbyname(). --ip will use that address for this machine. (don't be stupid and set it to a value that isn't configured to this machine.) 
And --ifdev will look up the ip from the configured network device. --ip overrides --ifdev. These options are only available from the command line options. Fix __you_cannot_kmalloc_that_much by switching the large kmallocs to vmallocs. 2004-10-31 Daniel Phillips Added kernel snapshot read lock upload for failover Added kernel worker daemon recovery for failover 2004-10-29 Michael Conrad Tadpol Tilstra Allow gulm services to lock core. Locked core will ignore the shutdown requests. gulm services that lock core must logout to unlock core. currently anything from kernel space locks core. Fixed an end case weirdism. Given a cluster with three servers, and at least two clients. Kill the Master, a Slave, and all but one client. Wait for the remaining Slave to become Arbitrating, and completely fence all dead nodes. After all killed nodes reboot, restart lock_gulmd as concurrently as possible. The journals for the mounted clients will not get replayed. The Problem. The lock_gulmd on the Slaves log into the Arbitrator, Arbit becomes Master. Lock_gulmd on killed Clients logs into Master before lock_gulmd on old Clients. This makes the bit of info about the killed Clients being killed get lost. So the old Clients do not see this when they return. So they do not know that there are journals that need replaying. Once the new Clients mount, they replay and things continue. If they do not mount, portions of the file system are blocked by expired locks. The fix. The current design of the cluster manager portion of gulm makes this overly complex. Without changing the design, either a new daemon needs to be written, whose purpose is simply to sweep the lockspace when nodes expire. Looking for the locks with lvbs that are storing jid mappings and twisting them so that the module knows to replay them. New daemon is icky, so not doing that. Changing the design of gulm at this point is icky too. Gulm is being maintained because it is a known code base.
Changing the design would break this. So not doing that. Since all that needs to be done is to fiddle one byte, given known values of other bytes in a lock, I just added a function to the lock server code. A bit kludgy putting it there, but it is the least intrusive solution. 2004-10-29 Jonathan Brassow - ensure that the config_version is an integer 2004-10-29 David Teigland allow drop_locks_count of zero to disable drop callbacks altogether 2004-10-29 yixiong Added a check for the hbversion. If the heartbeat version is greater than "1.9", a "HAVE_NEW_HB_API" is defined. This enables the snmp subagent to maintain a single source tree for both the main branch and the STABLE_1_2 branch. CVS patchset: 4230 CVS date: 2004/10/28 23:26:37 --HG-- extra : convert_revision : dcd0126ce4e91769d6ef5511d9cfc36d77438077 2004-10-28 Ken Preslan More comments from ben.m.cahill@intel.com. 2004-10-28 yixiong Fix the config summury regarding cms when it is not enabled. CVS patchset: 4225 CVS date: 2004/10/28 21:29:12 --HG-- extra : convert_revision : 2526f06aed56f5404eaadf5665a1410d408c9749 2004-10-28 Chris Feist Fixed configure script so it would configure rgmanager. 2004-10-28 Jonathan Brassow - change a log_msg to log_err, so that the error is printed even if the user does not specify the -v - fix problem that generated a circular dependency 2004-10-28 Patrick Caulfield Example of using libcman First cut of a libcman - comments welcome Add userland API to get the cluster name/ID Tidy the userland API so it takes nodeid 0 to mean "us" 2004-10-28 David Teigland Cached null locks that had been used with plocks were being freed too early, before the the unlock completion ast. 2004-10-27 Ken Preslan Fix minor permissions issue. 2004-10-27 Lon Hohberger Prune tree when a child type is not allowed 2004-10-27 Ken Preslan Fix bug #126952 Now, GFS freezes and unfreezes a filesystem by reading and writing to /proc/fs/gfs instead of doing ioctl()s directly on GFS files. 
This fixes a bug where a process trying to unfreeze a filesystem would end up blocking behind another process holding a lock on the mountpoint and waiting for the filesystem to be unfrozen. This bug has always been present, but it's much more visible in the 2.6 code. Note that the arguments to "gfs_tool [un]freeze" need to be exactly what's found in /proc/mounts now. Also, "gfs_tool margs " replaces the old way of specifying extra mount arguments: "echo > /proc/fs/gfs". Whitespace munging. 2004-10-27 Lon Hohberger * Pass SHAREDIR to build. * Support 'forbid' flag to prevent resource children. 2004-10-27 Chris Feist Removing accidentally included defines.mk Added makefile Updated configure scripts to include a share directory (required by rgmanager) 2004-10-27 Michael Conrad Tadpol Tilstra Fixed some queue jumping. If the lock was held shared, and a new request for shared came along, the new request would succeed, even if there was another request in the conflict queues that came prior to it. 2004-10-27 Lon Hohberger Removed. Replaced by clusterfs.sh/netfs.sh 2004-10-27 Chris Feist Fixed rgmanager build process 2004-10-27 Lon Hohberger Add preliminary netfs (NFS) and clusterfs (e.g. GFS) resource agents. Add verify-all support to all file system agents. 2004-10-27 Chris Feist Added make directory for rgmanager Changed the build process for rgmanager so it more closely resembles the other directories in cluster. 2004-10-27 Ken Preslan Stop compiler warnings. 2004-10-27 Michael Conrad Tadpol Tilstra - If a node remounts on the jid while we are replaying their journal, don't be mean and mark the jid as being free when we finish. - for real, remove the gulm_jid.h file. 2004-10-26 Michael Conrad Tadpol Tilstra - removed node locks. The things they worked around in the past do not need working around anymore. - remove gulm_jid.h works better if in gulm.h (not to mention some bits were wrong.)
2004-10-26 Chris Feist Updated version number Added a target to edit the release numbers when preparing tarballs. 2004-10-26 Michael Conrad Tadpol Tilstra - stop multiple local jid list scans on the same fs from trampling each other. 2004-10-26 Patrick Caulfield warning - large checkin. Change to using kernel_sendmsg & kvecs instead of sock_sendmsg & iovecs. This makes the code a lot neater in places and gets rid of some potential user/kernel address confusion. Oh, and it only works on 2.6.8 upwards. 2004-10-26 Lon Hohberger Fix x86_64 build warnings Fix warning on x86_64 2004-10-26 David Teigland A little optimization I've intended to do forever: get rid of the node list search in name_to_nodeid. Add a nodeid array to just index into. 2004-10-26 Patrick Caulfield Add priority to cman sockets 2004-10-26 Lon Hohberger Copy changes to old cman plugin. Fix bug feist was seeing during builds on x86_64 Change to match magma API changes (clu_members_lost->memb_lost) Take out unnecessary printfs. Change to match with new memb_lost/gained functions in magma. (1) Split up thread and non-thread libraries so applications which do not need pthreads can operate without linking against it. (2) Provide mutual exclusion/sleep hack around default-plugin's locking functions. (3) Change clu_members_lost/clu_members_gained to memb_lost/memb_gained to match up with other functions. (4) Add cp_connect function. (5) Add thread_test program for exercising multithreaded locking using magma's clu_lock/clu_unlock functions. (6) Clean up warnings in cluster_cmd Ensure we wait for the AST during the unlock as well as the lock. Simplify when we assign/free the lksb in the lock function. 2004-10-25 Lon Hohberger Make all the different libdlm targets use -fPIC during builds for proper symbol relocation on x86_64 2004-10-25 Michael Conrad Tadpol Tilstra - sometimes when starting multiple gulm server for the first time, some may drop out due to GenID conflicts. 
At times other than this startup case, this is what we want. But when we are just starting up like this, we'd rather that the server give it another try. So they do that now. 2004-10-25 Adam Manthei bug 137035 -- add support for configurable ssl port locations to fence_ilo bug 137037 -- Add support for perl-Crypt-SSLeay package to fence_ilo man page for fence_bladecenter Yet another method for fencing the IBM bladecenter. This method requires that the bladecenter is running with firmware new enough to provide a telnet interface to the management module. This effectively deprecates fence_xcat and is now the preferred fencing agent for the IBM bladecenter. 2004-10-25 Lon Hohberger Per Patrick's comments: open/create lockspace when first lock is taken instead of at init time. 2004-10-25 Patrick Caulfield Remember to free connections[0] if initialisation fails. Don't deref freed skb. Thanks Ben. Close any created listening sockets if "listen_for_all" fails Only wake astd when there is work for it to do. Also make the wake condition consistent with the check for work 2004-10-23 alan Put in new version number CVS patchset: 4205 CVS date: 2004/10/23 16:22:40 --HG-- extra : convert_revision : 1d99d965a27c5b739397ff9332f795d63655ffcb 2004-10-22 Lon Hohberger Add cp_connect() function to library. Make cptester only use cp_* functions. Use libdlm_lt with cman and sm plugins (removes dependency on dlm_pthread_*). Also make all the return value semantics match. 2004-10-22 Michael Conrad Tadpol Tilstra - update the patch. 2004-10-22 Benjamin Marzinski removed unneeded variable Updated gnbd-kernel patches for 2.6.9 2004-10-22 Michael Conrad Tadpol Tilstra - Updated lock_gulm for the 2.6.9 kernel. 2004-10-22 Chris Feist Added cman-kernel patches for the new 2.6.9 kernel. 2004-10-22 David Teigland update update from src files Change the default drop-locks value to 50,000. Both the drop locks value and period are now configurable through proc.
2004-10-22 Patrick Caulfield Revert 1.40 as it causes astd to spin for some reason I haven't fathomed yet. 2004-10-22 David Teigland Update to the new plock lm interface and 2.6.9. This fixes the problem where the dlm and vfs locks could get out of sync. 2004-10-22 Michael Conrad Tadpol Tilstra Don't requeue front on conflict with priority flag. Else there is a race with multiple lock requests with priority flag that deadlocks request. 2004-10-22 Ken Preslan Update to 2.6.9. (Lock_dlm and lock_gulm are broken until updated by their owners.) 2004-10-22 Daniel Phillips Fix csnap_create so that csnap_destroy recovers dm devices on error Fiddle with command line arg order, really needs proper arg parsing 2004-10-21 Patrick Caulfield Add some (optional) stats collecting. Don't call wake_astd when we're doing a remote unlock. Add lock ORPHAN state, and associated query. Locks go into this state when they are marked PERSISTENT and the user process exits without unlocking. Thanks to Stanley Wang of Intel for most of this. 2004-10-21 David Teigland mention how to use a different port for cman udp, not tcp 2004-10-20 Ken Preslan Update patch. Stuff all of the assert message into the panic call. 2004-10-20 Chris Feist Removed a debug line from the Makefile, which is not necessary anymore 2004-10-20 andrew 2 new CRM tests CVS patchset: 4181 CVS date: 2004/10/20 14:01:47 --HG-- extra : convert_revision : a07ab0861bd889c2ec422aef92214ece4b61bcd7 2004-10-20 David Teigland We need to call process_requestqueue() when finishing a first recovery (upon joining the lockspace). Although rare, it's possible to get dir lookups from other nodes prior to our own finish. These need to be taken off the requestqueue and processed. Hung gfs mounts are one evidence of this bug. 2004-10-19 Benjamin Marzinski Fixed gnbd_import and gnbd_export so that -r with no devices returns an error. 
This still needs to be fixed in the 6.0 code 2004-10-19 Michael Conrad Tadpol Tilstra - Added command to gulm_tool to query the services that have connected to a given gulm_core. 2004-10-19 Patrick Caulfield Changed the protocol header to include the source port number so it can be passed to an application. I've moved things around in the header so if you try to use an old cman with this one they should just not see each other. Also widen the flags to an int32 so we don't have to do that ugly shifting when checking them...it also keeps the structure aligned correctly. 2004-10-19 David Teigland a "minimum gfs" howto that I wrote up a long time ago: one dedicated gnbd server and two gnbd clients as gfs nodes, haven't actually walked through and tested it, so... A slight variation on one of the recovery special cases wasn't handled correctly; add small fix and extend comment description. the journalid code inherited the hold_lvb changes for gfs; it's not necessary so skip update per build improvements by cfeist and anticipating a 2.6.9 kernel 2004-10-19 Daniel Phillips Agents now instantiate servers using a dlm-based protocol 2004-10-19 Chris Feist Misspelled the module_dir in the previous checkin. (It was moduledir) Modified the 'make all' command to install the files into the cluster/build directory. This allows one to do 'make all' and build everything w/o actually installing files somewhere on the system. Removed old includes from the kernel. Added an include for ${incdir} in case inc dir is not /usr/include 2004-10-18 Patrick Caulfield Add a loopback to the comms layer so that clients can send messages to themselves or other clients on the same node. Note that, by default, broadcast sends will still /not/ be copied to the local system unless MSG_BCASTSELF is added to the flags.
Compile two DLM libraries, one that needs pthreads, and one that doesn't 2004-10-16 alan Added core dump directories, and a bunch of code to cd into the right core dump directory, and activated that code in several different applications. Note that I didn't do them all -- in particular the SAF/AIS applications haven't been touched yet. CVS patchset: 4154 CVS date: 2004/10/16 04:12:56 --HG-- extra : convert_revision : 1b308ef8a7cde6508fc9f1e21a244114835f76b1 2004-10-15 Michael Conrad Tadpol Tilstra - i know they're unused, but it hurt my brain less if they're at least correct. 2004-10-15 Daniel Phillips Add agent.c and sendagent.c 2004-10-15 Jonathan Brassow - fix for bug rbz134282 (bad handling of port number specification) 2004-10-15 Michael Conrad Tadpol Tilstra - stop printing out lock ids in different formats. it's confusing. Here's a patch to fix a gulm_tool bug that allows a user to enter the commands `gulm_tool nodelistcrap $server` and `gulm_tool nodeinfocrap $server $node`. patch from adam manthei. 2004-10-14 Daniel Phillips Sigh. Get the gfs bits out of the patch Teach diff about dm-csnap.c and dm-csnap.h 2004-10-14 Patrick Caulfield A simple DLM "hello world" example from Daniel. Improve support for PERSISTENT locks, and make the lkid checking a bit more paranoid. 2004-10-14 Michael Conrad Tadpol Tilstra - This removes the limit of 300 jid mappings. - get builds with --prefix working. 2004-10-13 Ken Preslan Don't align the glock hash buckets. The gfs_sbd structure is now about 10 times smaller. Formatting. o Fix bug #135249 with a lot of help from Dave Teigland. Ever since the create transaction was broken into two transactions, it was possible for the second transaction to happen without having to allocate disk space. When this happened, GFS wasn't locking the resource index before searching it. Add the correct locking. o Other munging. 2004-10-13 Patrick Caulfield add DLM_SBF_VALNOTVALID so that the examples compile again.
It's not implemented yet, but will be. Quick overview of the libdlm function calls. Dear, oh dear this was out of date. This should be a lot better but please report anything still amiss. Fix lock modes constants that were incorrect. Simple LVB test prog 2004-10-13 David Teigland use simpler kthread routines for serviced 2004-10-13 Patrick Caulfield Open the default lockspace if dlm_get_fd() is called before any other locking operations. Make sure that any routines calling pthreads are surrounded by #ifdef _REENTRANT so a non-pthread library can still be built. 2004-10-13 David Teigland fix dependency 2004-10-13 Michael Conrad Tadpol Tilstra - VFS does weird things with the error results, so before we try to return a gulm error code, flip it to -1. This fixes bz#132772 - don't return positive error codes on loc module mount failures. does not fix bz#132772 2004-10-12 Michael Conrad Tadpol Tilstra - plug leaky memory. 2004-10-12 Ken Preslan I killed a bunch of people once. 2004-10-12 Patrick Caulfield Move pingtest into userspace and check for invalidated LVBs Also some cosmetic changes to the other tests that were lying around in this directory when I typed "cvs commit" 2004-10-12 David Teigland another shot at correctly stopping dlm_recoverd thread when shutting down the lockspace, the kthread routines aren't much help in this situation 2004-10-12 Patrick Caulfield Don't free "othercon" connections when they are closed as they might still be busy in the read loop. 2004-10-11 Lon Hohberger Handle default value in RA parameters Remove references to handlers for RAs. They're not needed. 2004-10-11 Michael Conrad Tadpol Tilstra - don't logout on kill -TERM 2004-10-11 Patrick Caulfield Quit rx loop if the thread is closing down. 2004-10-11 David Teigland Avoid doing a synchronous unlock in unhold_lvb by just letting the lock_dlm thread delete the lock struct when the unlock completes. 
This prevents a possible deadlock between lock_dlm threads where one is doing recovery and the other is doing a gfs callback which can do an unhold_lvb. 2004-10-11 andrew Missing ignore files CVS patchset: 4106 CVS date: 2004/10/11 09:44:19 --HG-- extra : convert_revision : b8359b0307b9a0a9fc3b53c7314e30a23a5939e6 2004-10-11 zhenh add cts classes for crm CVS patchset: 4105 CVS date: 2004/10/11 07:40:43 --HG-- extra : convert_revision : 89f00fe56a4f65b1c935c269db424f58524e4f3c 2004-10-09 alan Changed LRM BasicSanityCheck so that it's generated from a .in file, so that the library pathnames, etc. can be built into it. CVS patchset: 4096 CVS date: 2004/10/09 14:11:38 --HG-- extra : convert_revision : b934e2468a188708111499a6ce7501485cece36f 2004-10-09 lge 199Kb patch transforming all // I could find into /* */ have a lot of fun :-/ CVS patchset: 4095 CVS date: 2004/10/09 01:49:40 --HG-- extra : convert_revision : 698301c47f6b2338c1c0efef012142de46185b35 2004-10-09 Ken Preslan o Support for immutable and append-only files with a lot of help from Anton Nekhoroshikh o Use intents to help with O_EXCL creates instead of the kludge we were using before. o Other cleanups. 2004-10-08 Lon Hohberger More doxygen stuff 2004-10-08 alan Made a slight ANSI fix to portability.h Apparently with -ansi enabled, inline is illegal... CVS patchset: 4085 CVS date: 2004/10/08 20:11:16 --HG-- extra : convert_revision : 4f3dd7e1826bf7e03998c98970e6fbbf454dd577 2004-10-08 Patrick Caulfield Close the lowcomms sockets during recovery rather than as soon as cman notices the node has died. This should help synchronise things a bit better and (hopefully) avoid spurious disconnect/reconnect events that could happen. 
Forgot to set node_state when a node is killed via STARTTRANS 2004-10-07 alan Fixed a very sophomoric error in the --enable-all feature in configure.in CVS patchset: 4063 CVS date: 2004/10/07 20:47:23 --HG-- extra : convert_revision : a5daff96045891136be6865b85020cfa41bb2a93 2004-10-07 Ken Preslan Print a message before, as well as after, trying to mount the lock protocol. 2004-10-07 Daniel Phillips Add "manual" csnap server failover 2004-10-07 Ken Preslan Munging capitalization and other stuff. 2004-10-07 Michael Conrad Tadpol Tilstra - Initialize entire sockaddr6 for binding. - Fixed tiny memory leak in config parsing. - was double decrementing the expired holders counter. stopped that. 2004-10-07 alan Changed configure.in so that if someone enables swig that it will abort the install if anything is missing. If you don't want this, then don't enable swig ;-) CVS patchset: 4058 CVS date: 2004/10/07 14:44:37 --HG-- extra : convert_revision : a674389055ceaabd89b35354bf783673b2d2136b 2004-10-07 Patrick Caulfield Tidy the close_connection() routine. in addition add a parameter telling it whether to close an attached "othercon" or not as close_connection() is often called to shut down in the event of an error and we only want to close down "this" socket, not others that may be busy. 2004-10-07 Daniel Phillips Updated design document with details on server failover 2004-10-06 Michael Conrad Tadpol Tilstra - the kernel side of things for getting the posix range F_GETLK cmd. - the kludging required server side to get something that posix can use for the F_GETLK command. 2004-10-06 alan Updated things in preparation for 1.99.0 CVS patchset: 4048 CVS date: 2004/10/06 16:39:40 --HG-- extra : convert_revision : 0b1f99710c70292cd86f31837d4a3e32dae2d2e7 2004-10-06 Michael Conrad Tadpol Tilstra - try to be a bit clearer about how you can name things in the servers list and what gulm does with that.
2004-10-06 Daniel Phillips Remove out of date 2.4.26 kernel patch csnap-agent now accepts multiple local connections, with polling (and so multiple csnap devices on one node works again). 2004-10-06 David Teigland sync up flags with dlm.h 2004-10-06 Ken Preslan More comments from ben.m.cahill@intel.com. 2004-10-05 Lon Hohberger More comments Add Doxyfile. Add lots of Doxygen-readable comments to code. 2004-10-05 Ken Preslan A patch from ben.m.cahill@intel.com to install the gfs_mount man page. Fix a few bugs in the eattr/acl code. Other munging. 2004-10-05 alan Added configure code to detect if chown works for non-root users. CVS patchset: 4023 CVS date: 2004/10/05 19:39:20 --HG-- extra : convert_revision : 3ce6395a0f434c2b3f520f788997c83e2e6cc7e6 2004-10-05 Ken Preslan Break compound assert into two. Munge. blerg. A command line tool that's a bit smarter about reading a GFS filesystem than gfs_edit is. 2004-10-05 Patrick Caulfield Update highest_nodeid at the client-end of a transition too. There is a tiny chance this could have been causing some memory corruption crashes. 2004-10-05 lars Round one of the STONITH cleanups. Still a bit rough around the edges, but hey, it's a development branch. Cuts out roughly 2kLoC and compiles, so it must be perfect. CVS patchset: 4019 CVS date: 2004/10/05 14:26:16 --HG-- extra : convert_revision : c6ec6f76ddc5bc4af1074867ae660dfdabd9fc29 2004-10-05 Patrick Caulfield JOINING timeout should go back to JOINWAIT rather than giving up. 2004-10-04 Lon Hohberger Remove MAX_MSG_SIZE limitation 2004-10-04 Daniel Phillips Move development to 2.6.8.1 Replaced csnap-2.6.7 patch by csnap-2.6.8.1 Renamed service.c to agent.c Added unlink for non-abstract named sockets Cleanups in dm-csnap.c client 2004-10-04 Michael Conrad Tadpol Tilstra - fiddled with a few comments. - make the dir that pid lock files go into now. - add name and ip to the cmdline arg help. 2004-10-04 Patrick Caulfield Thought I'd checked this in last week.
Cope with sparsely allocated (and large) nodeIds. Gets rid of the "max_connections" config variable 2004-10-04 David Teigland update from src files lock_dlm needs to return EAGAIN itself we must provide the correct astarg to dlm_unlock now that NULL is valid 2004-10-04 Daniel Phillips Get rid of startup race by having csnap server fork its own daemon after successfully binding to port, instead of letting shell do it. Change to named socket for device control connection 2004-10-01 gshi Add Alain St-Denis's stonith code for Compaq's RILO three files are added lib/plugins/stonith/README.riloe lib/plugins/stonith/ribcl.py.in lib/plugins/stonith/riloe.c This code compiles but I don't have the hardware to test the code before I commit. CVS patchset: 4006 CVS date: 2004/10/01 20:12:49 --HG-- extra : convert_revision : bb43a5b3ac389ac0ea4765b768846d2c1ea3104b 2004-10-01 Michael Conrad Tadpol Tilstra *sigh* :wq - fix inaccuracies in the counters. 2004-10-01 lge make --enable-ansi work on linux does only affect users of gcc with --enable-ansi does not hurt anything else CVS patchset: 3992 CVS date: 2004/10/01 12:04:12 --HG-- extra : convert_revision : e2504ab613afffedafb0c8ed0351080bac7b34ce 2004-10-01 Patrick Caulfield Kernel check for max nodeid too. in case anyone bypasses cman_tool 2004-10-01 Daniel Phillips Target now asks for connection when it needs one Service.c now looks more like a failover daemon The real devspam.c data checking utility checked in 2004-09-30 Patrick Caulfield I am an idiot Honour the wanted_nodeid when we are the first node in the cluster. Don't say "we are leaving the cluster" when we never actually join it. Sanity check the nodeID Make sure the nodeids array is increased enough to cope with very large increments in the nodeid.
I don't know why anyone would want to add a node id "1000" to a cluster with 10 nodes in it, but it shouldn't oops when you do :) 2004-09-30 David Teigland always take the ast arg from dlm_unlock(), even when NULL 2004-09-30 Michael Conrad Tadpol Tilstra - must remember to decrement holder count when removing holders from list. 2004-09-29 Michael Conrad Tadpol Tilstra - changes to match last change in userspace. 2004-09-29 Daniel Phillips Now works with unmodified device mapper, hack removed Create virtual device by execing dmsetup Add example code for establishing connections 2004-09-29 zhenh add /var/run as HA_VARRUNDIR CVS patchset: 3952 CVS date: 2004/09/29 03:40:39 --HG-- extra : convert_revision : d9bd689606476fc00c3b03436ea287c80a7bed05 2004-09-28 Daniel Phillips Socket-over-socket control interface now functional 2004-09-28 Lon Hohberger Port from clumanager 1.2.x: potential fix for #114388 2004-09-28 alan Added BEAM configuration file Updated all our .cvsignore files (BEAM things mostly) CVS patchset: 3947 CVS date: 2004/09/28 14:37:01 --HG-- extra : convert_revision : 40b700fc876b2775df15cbad9877c6cce1ffa314 2004-09-28 Patrick Caulfield Wait for membershipd to shutdown before starting to clean up. This should ensure that the LEAVE message will get sent. Get rid of REMOTEMEMBER the kernel's interface numbers start at 1, not zero. So we need to increment "number" before passing it into cman. 2004-09-28 andrew Missed one CVS patchset: 3945 CVS date: 2004/09/28 08:55:52 --HG-- extra : convert_revision : 7a20f1e2ee812a64f3d4bdc13040d5c462805b36 Generate these so they can be moved (and still find the right helper scripts) CVS patchset: 3941 CVS date: 2004/09/28 08:45:41 --HG-- extra : convert_revision : 642f68ae6327b548d48ce1ab307cb2cb7071d492 2004-09-28 Patrick Caulfield Remove state REMOTEMEMBER as it's not been used for ages. I've started the enum at 1 so that the other don't move though... Also cope better with dodgy networks during join. 
Send JOINREQ messages until we get a JOINACK or we timeout rather than just sulking. 2004-09-27 Michael Conrad Tadpol Tilstra - removed unused function. 2004-09-27 Patrick Caulfield Make /proc/cluster/status a bit more consistent in its output, and also show a little more when not in a cluster. 2004-09-27 Michael Conrad Tadpol Tilstra - removed some code that wasn't called anymore. - added some static initializers. 2004-09-27 Patrick Caulfield Print "left cluster" reasons as text. Tidy daemon shutdown so that membership is always responsible for sending LEAVE messages and will only do them during its shutdown. There are still a few send_leave() calls in other places but these are all just before a panic. Only stop recoverd if it is running 2004-09-25 Daniel Phillips Actually commit (cvs, why are you so broken?) 2004-09-24 msoffen Added --enable-all feature. Instead of needing to type every feature, just type this one. CVS patchset: 3892 CVS date: 2004/09/24 21:17:01 --HG-- extra : convert_revision : f7fb223345527c9bf196f1628cf7adaecca67c8c 2004-09-24 Michael Conrad Tadpol Tilstra - have magma use the new quorate info. - update the gulm lib to reflect new quorate info. - fixing an ism of gulm's gfs-only past. There was not a way for services on clients to truly know if the gulm servers were in quorate or not. So there is a new message that gets pushed out when quorum is gained or lost. this is related to bz #129879 2004-09-24 Daniel Phillips Added devspam.c device data verification utility Added socket-over-socket connection passing, not tested yet! 2004-09-24 David Teigland use down_interruptible, fcntl may return EINTR 2004-09-23 Lon Hohberger First pass at resource tree deltas (tested). 2004-09-23 Michael Conrad Tadpol Tilstra - these build again. (forgot to update them with library.) 2004-09-23 Patrick Caulfield Free unused direntry structs when releasing a lockspace.
2004-09-23 David Teigland An agent's error message should now be recorded in syslog (worked in my test). Not sure why WIFEXITED didn't work as expected. 2004-09-22 Benjamin Marzinski fixed printing issue for 6.1 2004-09-22 Michael Conrad Tadpol Tilstra Basic posix range locks work in gulm now. Lots of changes required to the gulm lock module to get this working. In order to get the plock code in place, the main lock request pathways were rebuilt. This made the code cleaner, as there were some pretty darn scary things going on before. The down side to this is that I changed the lock pathways for everything. The basics do seem to work though. Much of the non-working bits of the jid code were removed. So we're still stuck at max cluster size of 300. Will deal with this later. Through the course of this, I found some files with code I wasn't using. They got removed. Some code was moved around. Some was pulled out of where it was randomly shoved originally, and put into its own file. There is most likely some other cleanup type things that can still be done. Mount/unmount on two nodes has been tested. ~5 minute runs with make_panic have been tested. Couple of plocks can be grabbed and released. There have been NO recovery tests done yet. 2004-09-22 msoffen Added /usr/ucb for whoami check. CVS patchset: 3870 CVS date: 2004/09/22 13:06:46 --HG-- extra : convert_revision : 18b34df794542d1353b94db8f1b4346b53a37d20 2004-09-22 Patrick Caulfield Add module ownership to various structures so we don't get unloaded whilst busy. Might fix bz 133142 but I want to do some more testing first. Fix bug in debugging routine. I hope we don't need it any more though... Use C99 initialisers. Also include module reference in file_ops. 2004-09-22 yixiong Adding a SNMP Agent BasicSanityCheck.
CVS patchset: 3869 CVS date: 2004/09/22 06:12:09 --HG-- extra : convert_revision : 4734239311d8cf895f66ea099fd4fe5d268b87f2 2004-09-21 Benjamin Marzinski even though you now fence nodes by their cluster name, not their IP address, fence_gnbd still needed an 'ipaddr' parameter, and the man page still said that you should use hostname or IP address. 'ipaddr' is now deprecated, and a warning is printed when it's used. The new parameter is 'nodename'. The man page was also updated. 2004-09-21 Ken Preslan Make an assert be more verbose. Do better asserting in glock_hold(). 2004-09-21 Patrick Caulfield Yet another one of those "I can't believe we've not seen it before" bugs. When we close a connection lowcomms now removes any pending writes to that node from the writequeue too. I bet this has been responsible for some weird behaviour in the past... 2004-09-21 David Teigland change the way we check for local plock conflicts 2004-09-21 Patrick Caulfield Always return non-zero exit code if we hit an error. 2004-09-21 Benjamin Marzinski Updated patches to 2.6.8.1 2004-09-20 Patrick Caulfield Fix that deadlock that's been there for ages but only ever pops its head up when you're looking for some other bug. Tidy. 2004-09-19 alan Added a tweak to allow BEAM to know not to run during configure otherwise configure takes forever. CVS patchset: 3830 CVS date: 2004/09/19 21:41:55 --HG-- extra : convert_revision : 77f48266df7100f1b0bd76f6f91c25b99ee36312 2004-09-17 msoffen Fixed -ansi to enable ansi CVS patchset: 3815 CVS date: 2004/09/17 14:06:19 --HG-- extra : convert_revision : 188887de28b8564472b20ec20c7a07921c92f46d 2004-09-17 Patrick Caulfield Add (untested) SELinux support Be a bit more paranoid about creating the DLM device node. If it doesn't exist but the lockspace does, then look up the minor in /proc/misc and create it.
2004-09-17 David Teigland get rid of "dir entry exists" messages add a couple syslog messages when nodes are fenced 2004-09-16 Jonathan Brassow - fix for bug rbz 132680 When an update was issued, the new conf got written to all nodes, but the daemon did not pick it up in memory until a request was made. So, if an additional update was done from another node before a request was made and it failed, the current in-memory (stale) version was rewritten to disk. 2004-09-15 Jonathan Brassow - remove unneeded print - add debugging line 2004-09-15 alan Put several fixes related to glib2 CVS patchset: 3769 CVS date: 2004/09/15 19:24:53 --HG-- extra : convert_revision : af1d49d5b0382b408784fc2b604b1149b9d86b15 2004-09-15 Michael Conrad Tadpol Tilstra - The first pass at getting range locking for gulm. This is server changes only. So far the basics seem to work. gulm should be implementing range locks that are compatible to what posix range locks expect. Gulm is lazy about merging ranges though, but this shouldn't affect the usage any. 2004-09-15 Jonathan Brassow - fix make clean so it cleans up binaries in bin dirs - add pthread_detach to ccsd (cluster_mgr.c) - print select failure in ccsd if not EINTR 2004-09-15 Patrick Caulfield Add support for allocated node IDs, both on the command-line and from CCS Add support for assigned nodeids so that people can have permanent node IDs assigned in cluster.conf if they really want it. Also tidied the joinconf code so that the validation is in a separate routine. Yes, I know I should have separated these out into two commits. Sorry. 2004-09-15 David Teigland Put a lock around start and exit of dlm_recoverd thread. This should fix the oops I got waking a dlm_recoverd that wasn't there.
2004-09-15 zhenh fix a bug CVS patchset: 3754 CVS date: 2004/09/15 01:15:36 --HG-- extra : convert_revision : 908cbf852b97ef5284fc85e49c3815a9e6cd9233 2004-09-14 Benjamin Marzinski there was a problem with running two gnbd_clusterd processes at the same time. This fixes that problem. 2004-09-14 Adam Manthei fix typos that Erling found. move some functions around so that the script compiles 2004-09-14 Lon Hohberger Update tasks 2004-09-14 gshi change glib API to glib2 API CVS patchset: 3743 CVS date: 2004/09/14 15:07:28 --HG-- extra : convert_revision : 8664733b952f2d5d4acd9144fcbb4f7573145b86 2004-09-14 lars OCF_RESOURCE_INSTANCE cannot be set on meta-data operation, as this applies to the class and not to an instance. CVS patchset: 3742 CVS date: 2004/09/14 09:46:41 --HG-- extra : convert_revision : eaec43a4cf522e8a0c851cd3f9c57b9711ea9eb8 2004-09-14 Patrick Caulfield Remove ourself from the waitqueue when told to quit. 2004-09-14 zhenh add the definition of HA_D CVS patchset: 3738 CVS date: 2004/09/14 07:49:41 --HG-- extra : convert_revision : ede323932f4595b9f9c8a1a1ead09b1aeef862a9 add 'then' to if statement CVS patchset: 3737 CVS date: 2004/09/14 07:48:43 --HG-- extra : convert_revision : de743d6772040ed4d56f593f719f5725afe49d71 2004-09-14 sunjd fix configure break -- try to make a nonexistent Makefile after files moved CVS patchset: 3735 CVS date: 2004/09/14 06:07:03 --HG-- extra : convert_revision : b439a064e725cd58288cbd245e4db3e6d7017897 2004-09-14 Michael Conrad Tadpol Tilstra - added --use_ccs to --help - added --use_ccs to man page. 2004-09-14 Ken Preslan This should be a fix for bug #126531. The issue has to do with the fact that GFS locking is only invoked by a mmap during the page fault. There is no way for GFS to know that the process causing the page fault has finished reading or writing to the page for now.
The way GFS has worked in the past is that as soon as the page fault has completed, the lock protecting the file is available to be demoted or dropped at the request of a callback from another machine. This means that a newly faulted-in page could get yanked out from under a process before it gets a chance to do sufficient work on the page. This checkin provides a way for the page fault code to make a machine be greedy with a lock for a certain period of time. This means that after the fault, GFS will ignore callbacks for a certain period of time. After that time, GFS will respond to any callbacks that have arrived and then mark the file's lock as being able to be made greedy again. The code is set up so that the amount of time that the lock is marked greedy is dependent on how much page-fault traffic there is on that file. When there is a lot of traffic, the time increases. When there is a little traffic, the time decreases. Whitespace munging. 2004-09-13 Jonathan Brassow - fix memory leak + other minor annoyances 2004-09-13 Lon Hohberger Zero out struct sockaddr_in6 structures before using... 2004-09-13 Michael Conrad Tadpol Tilstra - from the top, it's lib/libgulm.a now. - clear out socket_in6 before using them. - from the top, it's src/lock_gulmd and src/gulm_tool now. - moved the option parsing code to where it was a while ago. I didn't make any comments last time as why I moved it to where it was, so while I imagine I had good reason, I cannot recall what it was. So putting it back, but leaving crumbs just in case. - If you want gulm to get configs from ccs, you must pass the --use_ccs option. The code that auto-detected ccs is broken, and I've got bigger things to do for now. I'll deal with this again later. - added some command line options to override the auto-detect for node name and ip. They're going to remain mostly undocumented for a while yet.
2004-09-13 Lon Hohberger Remove MAX_MSG_SIZE limitation 2004-09-13 Adam Manthei man page updates for the fence_rib and fence_ilo agents. adding the Net::SSLeay version of the fence_rib agent. This should be used by those with iLO interfaces instead of the fence_rib (stunnel) version of the agent. Add ribcl version 2.0 support to fence_rib. Note that later versions of the iLO firmware may not work well with stunnel. Therefore it is better to use fence_ilo which uses the Net::SSLeay perl module instead. 2004-09-13 Michael Conrad Tadpol Tilstra - cleaned up the makefile as per aj req. 2004-09-13 Lon Hohberger dumb was requiring libxml2 to build for no reason 2004-09-13 Patrick Caulfield Remove distracting comment to which the answer was "no, it doesn't" 2004-09-13 Lon Hohberger Periodic status checks, updated resource agents, misc. bugfixes 2004-09-13 A. J. Lewis o allow CFLAGS to be added to from the command line 2004-09-13 sunjd make OCF_RA_DIR export to config.h.in CVS patchset: 3718 CVS date: 2004/09/13 10:16:46 --HG-- extra : convert_revision : 87a8256c3dead505b062b04a50708a2863e35561 2004-09-13 Patrick Caulfield Allow a bit more flexibility in how nodes are specified in cluster.conf You can now use fully-qualified host names as well as abbreviated host names. If you do use FQDNs in there then uname must return the full name too OR you can override the hostname by using -n on the command-line. For cman, I recommend NOT using full hostnames in cluster.conf 2004-09-13 lars Path to OCF_ROOT fixed.
CVS patchset: 3713 CVS date: 2004/09/13 08:30:51 --HG-- extra : convert_revision : ac19898a336e7cf10b9fd8a97a7e48b4995bf82c 2004-09-13 Patrick Caulfield There is a small chance that rem_node could be NULL when passed into process_cnxman_message if people are doing silly join/leave test loops :) 2004-09-13 sunjd fix a pathname error which breaks the build CVS patchset: 3709 CVS date: 2004/09/13 03:08:59 --HG-- extra : convert_revision : 612520a8479d1558bbce38e300a1d19b0a55ff4f 2004-09-11 alan Put in fix making missing xml libraries fatal when CRM requested Put in Solaris Fixes from David Lee. CVS patchset: 3703 CVS date: 2004/09/11 18:25:13 --HG-- extra : convert_revision : 5c87639eef249ec8e59c3017fe30e3a857cccc7b 2004-09-11 Michael Conrad Tadpol Tilstra - temporarily remove ccs from gulm. It's way not working. Need to figure out why, but tired of people telling me about it not working when I already know. 2004-09-10 Ken Preslan Add a missing -Wall. 2004-09-10 A. J. Lewis o CFLAGS+= instead of CFLAGS= so you can add options from the cmdline 2004-09-10 Adam Manthei From Lazar Obradovic: this is a patch to allow fence_ibmblade to use a udp port other than standard snmp (udp/161). Main reason for this is that IBM BladeCenter MM supports only 3 hosts and not hostgroups per community, and only 3 communities, which puts a limit of a maximum of 9 nodes in a cluster. This is somewhat inconvenient, so, as a workaround, one can install a udp forwarder on some node(s) (preferably outside cluster) and use only its address in IBM BladeMM configuration. Port Forwarder will probably have to use some other port than standard snmp, not to block snmp access of "relay" node, so that's what this patch is all about. 2004-09-10 Patrick Caulfield Make sure we wake the membership thread before waiting for it to complete Make all node removal happen in remove_node().
This has the nice side effect that kernel listeners get notified of all nodes that are removed :-) 2004-09-09 Michael Conrad Tadpol Tilstra - use safer naming for the dump files that go into /tmp. 2004-09-09 Patrick Caulfield Clear the use_count if shut down with "force" Free cluster ref if we fail to start. Don't get blinkered when in NEWCLUSTER modes...there may still be things out there! or, (in sensible language) Also send JOINREQ message when in NEWCLUSTER as well as STARTING states. 2004-09-09 alan Put in a configure.in fix due to David Lee CVS patchset: 3662 CVS date: 2004/09/08 22:10:50 --HG-- extra : convert_revision : 1c28e1c51490f20d97f4f435d34811a54cc2edfe 2004-09-08 Michael Conrad Tadpol Tilstra - found some stuff I forgot to remove. 2004-09-08 Ken Preslan Great comments from ben.m.cahill@intel.com. 2004-09-08 Michael Conrad Tadpol Tilstra - turned off a couple debugging things. - The first 'half' of range lock support for gulm. Doesn't actually do range locks, but all of the io for them should be there. - Adds support for finer grained uniqueness of lock holders. Very easy to use this to provide per process locks. (magma plug for gulm now does this.) - Everything should work as before. Only notable changes should be the libgulm.h and output of dump files. 2004-09-08 Patrick Caulfield Fix a subtle bug in the node IDs code where a node could get a different node ID than last time if all the other members who knew about it before have also been down in the meantime. It also fixes a possible issue where nodes didn't have the same information about DEAD nodes, as only MEMBER nodes were propagated at join time. Now, MEMBER & DEAD node details are distributed so that the whole node view should be completely consistent across the cluster. Tweak the way NEWCLUSTER works in an attempt to prevent the splits that can happen if new nodes all try to join/form a cluster at the same time.
Based on an idea by dct, this sends NEWCLUSTER messages out repeatedly after a time of patient watching and waiting. there then could be a tie-break based on the IP addresses. This is for bz #126991. 2004-09-08 Ken Preslan Added "cmd" argument to lm_plock(). 2004-09-07 Lon Hohberger OCF Actions + Autostart param addition 2004-09-07 Ken Preslan Dave pointed out that I screwed up plocks. 2004-09-07 David Teigland - if a first start was aborted after getting past node setup, the nodes would be left over for the subsequent start -- need to clear them - remove some unused code from nodes file and tidy clearing nodes list use kthread routines 2004-09-06 Patrick Caulfield Improve (I think) the usage message. Well, it's more verbose anyway. 2004-09-06 David Teigland gfs needs a positive error from plock_get if there's a conflict 2004-09-06 Patrick Caulfield Change cman_tool kill to take a node name instead of a node number. This is more useful and more intuitive and more consistent. Rename "local_nodeid" so it doesn't clash with kernel's internal use. 2004-09-06 lars Heh. Cut & paste from ipaddr2 ;) CVS patchset: 3634 CVS date: 2004/09/06 10:33:02 --HG-- extra : convert_revision : 3e4b6869fb7e0a9b0f75934adea4b3a5680747fd Some more comments to think about. CVS patchset: 3633 CVS date: 2004/09/06 10:32:37 --HG-- extra : convert_revision : 1391e47f55d34bfae7e9ca2275e77588adc68fa0 drbd OCF RA as proof-of-concept code. CVS patchset: 3632 CVS date: 2004/09/06 10:20:35 --HG-- extra : convert_revision : 6eda49bd09b95bc0e833554d7e8bde8c5942f889 2004-09-06 David Teigland use kthread routines remove a couple log_debug's 2004-09-03 Daniel Phillips Initial add, csnap code and docs 2004-09-03 Jonathan Brassow - this should get rid of the ENETUNREACH problems... as well as the CONNREFUSED that AJ was seeing. The new 'struct sockaddr_storage' structure I was using was not being memset before being populated. So, there were some extra fields at the end that were not right. 
Specifically, strace showed: NORMAL (no failure): sendto(11, "\7\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_INET6, sin6_port=htons(50007), inet_pton(AF_INET6, "ff02::3:1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 20 BAD (returns ENETUNREACH): sendto(10, "\7\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_INET6, sin6_port=htons(50007), inet_pton(AF_INET6, "ff02::3:1", &sin6_addr), sin6_flowinfo=34, sin6_scope_id=4144328704}, 28) = -1 ENETUNREACH (Network is unreachable) So, it appears that sin6_flowinfo and sin6_scope_id were crap. 2004-09-03 Ken Preslan The -mm kernel that includes our flock patch. 2004-09-03 lars Add checking whether iptables was found at configure time. CVS patchset: 3602 CVS date: 2004/09/03 14:40:34 --HG-- extra : convert_revision : 9806294d9c7601fdaf84e618ee0a61b7671c7c06 2004-09-03 David Teigland update from src files 2004-09-03 lars Adding OCF RA IPaddr2; iputil2 based IP address agent. Also capable of multiple incarnations, in which case it will automatically switch to configuring the Cluster Alias IP. Untested ;) CVS patchset: 3598 CVS date: 2004/09/03 10:01:37 --HG-- extra : convert_revision : b1c2e364108831f53245ffc07bef99c0a7061747 Cleanup logging. CVS patchset: 3597 CVS date: 2004/09/03 10:00:45 --HG-- extra : convert_revision : 7956674546e7e8cfc6a60e09154616742befe496 2004-09-03 David Teigland Changes to how dlm flags are used based on dlm changes. Change the way we deal with gfs's "hold_lvb" requirement: use a separate NL lock to represent a "held lvb" instead of replacing unlocks with conversions to NL. This way unlocks from gfs are always handled consistently, with a corresponding dlm unlock. Change the lock granting logic to mirror the behavior of VMS. The effect of QUECVT and EXPEDITE flags have also been changed to match VMS. A couple new flags have been added to allow for non-vms-like behavior that's required when using range locks. 
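The ENETUNREACH failure in the 2004-09-03 Jonathan Brassow entry above came from a struct sockaddr_storage that was populated without being cleared first, so sin6_flowinfo and sin6_scope_id held stack garbage. A minimal sketch of the corrected pattern follows; the function name fill_ipv6_dest is hypothetical and is not taken from the ccsd source.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fill `ss` with an IPv6 destination address.
 * Returns 0 on success, -1 if `addr` is not a valid IPv6 literal.
 * The memset() is the point of the example: without it, sin6_flowinfo
 * and sin6_scope_id keep whatever bytes were already in the buffer.
 */
int fill_ipv6_dest(struct sockaddr_storage *ss, const char *addr,
                   unsigned short port)
{
    struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)ss;

    assert(ss != NULL && addr != NULL);

    memset(ss, 0, sizeof(*ss));          /* clear every field first */
    sin6->sin6_family = AF_INET6;
    sin6->sin6_port = htons(port);
    if (inet_pton(AF_INET6, addr, &sin6->sin6_addr) != 1)
        return -1;
    return 0;
}
```

With the buffer cleared, a sendto() to ff02::3:1 port 50007 would carry sin6_flowinfo=0 and sin6_scope_id=0, matching the NORMAL trace rather than the BAD one.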
2004-09-03 Ken Preslan o Fix the BLKGETSIZE64 ioctl for Linux 2.6. o Use lseek() to to determine the device size if BLKGETSIZE64 and BLKGETSIZE fail. Get rid of a couple more 2TB checks. 2004-09-02 Jonathan Brassow - a couple minor updates to multicast code - add a '-t' option for specifying ttl, default is still 1 (IPv4) - use 224.0.2.5 (IPv4) and ff02::3:1 (IPv6) as default multicast addrs 2004-09-01 Jonathan Brassow - correctly specify the libdir in configure as /usr/lib - add IPv6 support to ccsd - add multicast support to ccsd - make the ccsd port numbers configurable - update ccsd man page - make the config file location tunable - make pid (lockfile) location tunable - still debating whether to turn on the option 2004-09-01 Ken Preslan Get rid of some pool crap. Get rid of an arbitrary 2TB limit that was there to keep 2.4 from screwing up. Ben Cahill's new man page. Update patches. 2004-08-31 Ken Preslan Rework patch to fit with Trond's comments. 2004-08-31 Michael Conrad Tadpol Tilstra - fixed a few ickys. - added some tests. 2004-08-31 alan Changed configure script so that it doesn't stop when it encounters the first missing components, but goes on and catalogues them all before quitting. CVS patchset: 3552 CVS date: 2004/08/31 16:25:53 --HG-- extra : convert_revision : ddf3fe45ec5f72293a67f1b0d040befbdca298bb 2004-08-31 Ken Preslan Print out PIDs. 2004-08-30 Ken Preslan Fix s_maxbytes. 2004-08-30 Lon Hohberger Fix build * src/resources/*: Add status/monitor actions to metadata * include/list.h: Update to fix compiler warnings. This is not complete; it's better to add a 'field' to structures requiring list specs. * src/clulib/vft.c: Remove unnecessary pthread locks. * src/daemons/*: Misc. code cleanups. 2004-08-30 Ken Preslan Patches for a kernel that supports the new plock interface. 
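The 2004-09-03 Ken Preslan entry above describes falling back to lseek() when the BLKGETSIZE64 and BLKGETSIZE ioctls fail. The following is a rough, Linux-only sketch of that fallback order, not the code from the GFS tools; the names device_size_bytes and device_size_of_path are made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKGETSIZE64, BLKGETSIZE (Linux-specific) */

/*
 * Hypothetical helper: store the size of the object behind `fd` in *bytes.
 * Tries the 64-bit ioctl, then the older sector-count ioctl, then lseek().
 * Returns 0 on success, -1 if all three methods fail.
 * Note: the lseek() fallback moves the file offset to the end.
 */
int device_size_bytes(int fd, uint64_t *bytes)
{
    uint64_t size64 = 0;
    unsigned long sectors = 0;
    off_t end;

    assert(fd >= 0 && bytes != NULL);

    if (ioctl(fd, BLKGETSIZE64, &size64) == 0) {
        *bytes = size64;                    /* already in bytes */
        return 0;
    }
    if (ioctl(fd, BLKGETSIZE, &sectors) == 0) {
        *bytes = (uint64_t)sectors * 512;   /* count of 512-byte sectors */
        return 0;
    }
    end = lseek(fd, 0, SEEK_END);
    if (end == (off_t)-1)
        return -1;
    *bytes = (uint64_t)end;
    return 0;
}

/* Convenience wrapper (also hypothetical): open `path` read-only and size it. */
int device_size_of_path(const char *path, uint64_t *bytes)
{
    int fd, rc;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = device_size_bytes(fd, bytes);
    close(fd);
    return rc;
}
```

On a regular file both ioctls fail and the lseek() path reports the file length, which makes the fallback easy to exercise without a block device.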
2004-08-29 msoffen Replaced all // COMMENTs with /* COMMENT */ CVS patchset: 3528 CVS date: 2004/08/29 03:01:11 --HG-- extra : convert_revision : 042610e668394c30bcaa96e6409152101bc1befb 2004-08-28 alan Put in changes to properly package our doc files and then took out the don't-auto-stop-on-unpackaged-files directive. CVS patchset: 3525 CVS date: 2004/08/28 15:23:21 --HG-- extra : convert_revision : 07d7bcee7a3e677cb99e3fab8f806aed1cee566e 2004-08-27 Ken Preslan Rearrange some asserts. Munge. 2004-08-27 lars If I make one more of those typos I'm gonna be uspet at my slfe. CVS patchset: 3507 CVS date: 2004/08/27 15:05:11 --HG-- extra : convert_revision : e6b12505f9648768d24a44d2657bb85f8e996c40 Really silly fixes. CVS patchset: 3506 CVS date: 2004/08/27 14:47:35 --HG-- extra : convert_revision : 0f11d4f8e2fd8de0ed087fe6c0a1d697abc5a309 Further cleanups. CVS patchset: 3505 CVS date: 2004/08/27 14:33:46 --HG-- extra : convert_revision : b64b46a3e5e669a19a9ee8feb3511837af5d572f Simple typo fix. CVS patchset: 3504 CVS date: 2004/08/27 14:28:21 --HG-- extra : convert_revision : 4119ed994619bd18bf6c197615a82c9f0b8359aa Add Dummy OCF Resource Agent which is somewhat easier to verify ;) Template for a ocf-shellfunc library. CVS patchset: 3503 CVS date: 2004/08/27 14:12:24 --HG-- extra : convert_revision : 141fd84c971140d17a8bfca3e1402b600985e776 Some more cleanups... CVS patchset: 3502 CVS date: 2004/08/27 09:52:10 --HG-- extra : convert_revision : 6fad52ac773c0c823deadb709704f9c277314780 Generate the files in the new location. Cleanup some path substitutions... CVS patchset: 3501 CVS date: 2004/08/27 09:49:30 --HG-- extra : convert_revision : 71fe4e07a92f2891c265c022aa6e01033dd4bb42 Move OCF resource agents to /usr/lib/ocf for now; prior to full LSB approval we may not create a directory directly under /usr. Also move OCF RAs to a heartbeat/ subdirectory under there as specified by the OCF RA text. 
CVS patchset: 3497 CVS date: 2004/08/27 08:42:52 --HG-- extra : convert_revision : b5d92399b4c481c53a3694f39bac80f481ea1389 2004-08-27 Ken Preslan Reimplemented the flock patch to work similarly to the new way plocks will work in 2.8.9. The FS acquires the VFS flock from its code. Helper scripts for "gfs_tool lockdump". 2004-08-26 Jonathan Brassow - use indexing rather than ccs list handling to retrieve node names That way fence_init will not fail in the single node case. 2004-08-26 Adam Manthei I forgot to add Lazar Obradovic's fencing agents to the bin Makefile. Thanks go to anton@hq.310.ru for pointing that out. 2004-08-26 Patrick Caulfield So /that's/ why it was up there... A couple of endian fixes. mixed-endian clusters now seem to pass a cursory test. Don't send HELLO as soon as the kthread is started as that might b e too early. Fix stupid cut & paste bug with IPV6 multicast. 2004-08-26 deng.pan move the test cases out of the configuration CVS patchset: 3494 CVS date: 2004/08/26 03:49:27 --HG-- extra : convert_revision : 30c130b55e8298296e38eac463e55f8dd4220266 2004-08-25 Lon Hohberger Unbreak gulm. 2004-08-25 Patrick Caulfield Get rid of some debugging info that is no longer relevant (or even compiles). Also fix some that is. pack the comms structures. They are already well-aligned but the compiler does exceed its orders slightly in this case. use REPLYEXP for ISLISTENING messages 2004-08-25 David Teigland change the order in which we set up join/leave values so another thread can't see the SEVENT flag set before the pointer is set. 2004-08-24 Adam Manthei Fencing agents contributed by Lazar Obradovic for o Qlogic SAN Box2 o IBM Blade Center (requires SNMP support. Can be used instead of the xcat fencing agent) 2004-08-24 Patrick Caulfield Be a little less ACK-happy during transition. ACKs can now be embedded in data messages so, where we know there will be a reply there is no need for a separate ACK. Run dlm_recoverd only when we need to do recovery. 
Return the lockid to userland as soon as we know it. 2004-08-24 David Teigland patches for 2.6.8.1 use kthread routines to start/stop ast thread 2004-08-23 Ken Preslan Update to 2.6.8.1. 2004-08-23 sunjd move the OCF RAs location CVS patchset: 3490 CVS date: 2004/08/23 14:49:31 --HG-- extra : convert_revision : 600d0deada0f21bcb9ddfc1f691f4c0fd39ba7d2 change for moving OCF RAs location CVS patchset: 3489 CVS date: 2004/08/23 14:43:59 --HG-- extra : convert_revision : f03ffc18ddab9bb2c0146f54bf702f6e04e4787a add .cvsignore CVS patchset: 3487 CVS date: 2004/08/23 14:10:09 --HG-- extra : convert_revision : 2414b6e027e80bf36b1c6f46093dd5dfee2c3bfa 2004-08-23 Patrick Caulfield Use seq_file for /proc/cluster/services so we don't crash the kernel if there are more than a few lockspaces/GFS mounts etc. Setting a bad example... 2004-08-23 sunjd move here CVS patchset: 3486 CVS date: 2004/08/23 10:09:02 --HG-- extra : convert_revision : 53e191ffadf91c0dd403b0c54d123f20cebd0c4b 2004-08-23 David Teigland Use a recovery daemon per lockspace. This avoids deadlock when nodes recover multiple lockspaces in different orders. 2004-08-23 Patrick Caulfield Update libdlm doc. 2004-08-20 Patrick Caulfield Also include the pid in LKB rebuild Add PID to a lock which can be returned using the Query API. Thanks to Jeff Orlin for most of this code. 2004-08-19 Jonathan Brassow - forgot to remove eattr from the Makefile - don't need the gfs_eattr man page, as the tool no longer exists 2004-08-19 Patrick Caulfield Tidy LVB handling 2004-08-19 David Teigland a couple error messages 2004-08-18 Patrick Caulfield Change AF_ number to 30 so it doesn't conflict with bluetooth. TIPC also uses this number and you're unlikely to be using both together. Create /dev/misc with reasonable permissions. 2004-08-18 lars This ain't June anymore... 
CVS patchset: 3475 CVS date: 2004/08/18 13:40:56 --HG-- extra : convert_revision : 84f3b38f16ed01fbeef9bc63044d5cc526cb0d7e 2004-08-18 zhenh add some configure for the pathes used by lrm CVS patchset: 3472 CVS date: 2004/08/18 13:13:48 --HG-- extra : convert_revision : 829a3e87e86329c7ed56b358ccaddf8fed4a4071 2004-08-18 Patrick Caulfield Don't hang lkbs off the ownerqueue list as we don't have any control over their lifetime. Now that LKBs are destroyed before the ASTs are run this causes real problems. The ownerqueue is now strung through the lock_info structs themselves and we free those up when we can see that the lkb has been removed by the DLM core. This (for me) fixes Lon's SMP oops and also the dlmlock hang/oops mentioned in bugzilla #130148 2004-08-18 David Teigland if the first start is cancelled by a stop before finishing, reset the last_start value (same as with dlm and fence services) and discard nodes saved from initial invalidated start. Use a separate DELAY_RECOVERY flag to postpone join/leave processing during recovery. This leaves the DELAY flag used only for retries when joins/leaves conflict. The interference between recoveries and delayed join/leave processing could lead to stalled recovery or a timer oops. some spacing cleanup 2004-08-17 Benjamin Marzinski Forgot to update kernel stuff. 2004-08-17 Patrick Caulfield Fix userspace LVBs (thanks, Jeff) Fix assert when creating sublocks - if we get asked to create a sublock on a master node then that's quite correct. Remove duplicate check. 2004-08-16 Benjamin Marzinski updated Makefile so that it make distclean works 2004-08-16 Jonathan Brassow - it wasn't a good idea to add rm -f /lib/libmagma* to the make file 2004-08-15 Benjamin Marzinski Oops. Forgot to update the man pages. 2004-08-14 Benjamin Marzinski Bunch of gnbd fixes. The big one is that gnbd_monitor works like it's supposed to... Or at least if it doesn't, it's a bug, as opposed to just not being done. 
Most of the changes are under the covers, so except for a more verbose gnbd_monitor list, all the UIs are the same. 2004-08-13 Lon Hohberger Temporary sledgehammer lock fix Clean up build Change cflags to make it build on newer compilers for now 2004-08-13 Jonathan Brassow - cman_tool.8 was still referenced in the cman_tool directory 2004-08-13 Lon Hohberger Initial checkin of rgmanager 2004-08-13 David Teigland update from src files 2004-08-13 Patrick Caulfield Knock the "I'm not getting out of bed for a message that small" size down to 20 bytes as the recovery code is quite capable of sending 21 byte messages. 2004-08-13 David Teigland update from src files use the force=2 option for dlm_release_lockspace() to disregard any existing locks remove assertion that's not correct when multiple lock requests are in progress on the same resource at once (e.g. plocks from lock_dlm). moved to man dir install man dir move cman_tool.8 into man dir, add Makefile to install man page for cluster.conf 2004-08-12 Lon Hohberger Remove crc32.c & build req. on it. 2004-08-12 Jonathan Brassow - move some stuff around. - fix brokenness when updates occur simultaneously - more error checking 2004-08-12 Patrick Caulfield Don't update the bastaddr/param for a convert until the conversion has actually completed. Patch from Jeff Orlin. Don't remove convert cancels from the list of locks owned by this process. 2004-08-12 David Teigland previous fix incomplete, one other copy to zero when a recovery interrupts first start, the last_start value needs to be reset to 0 too to avoid assertion failure copy the current config_version into the cl_version struct in GET_VERSION. "cman_tool version -r " can be used to update the config version of all cluster members. more refining of whether we deliver or skip bast delivery 2004-08-11 Patrick Caulfield I think it would be a really nice idea to actually pass the flags into the kernel, rather than ignoring them, don't you?
If the dlm_unlock() call fails, then put the lock back on the ownerqueue so it gets unlocked when the device is closed. Don't grant locks that are waiting for unlock. Don't deliver blocking ASTs to locks that are in progress. 2004-08-11 David Teigland - rebuild_freemem() was being called too early - the grantqueue must be checked for NOREBUILD locks when deciding if an rsb should be remastered better format for lock dump on assert change flag name to reflect expanded use basts don't apply to the "plock update" lock, so don't provide a bast function. also use the delete_lp routine to free it. 2004-08-10 Benjamin Marzinski link to the pthread library because magma needs it now. 2004-08-10 Patrick Caulfield Don't clear li_flags after setting the FIRSTLOCK bit Put some locking around the lockspace list. Tidy up how the user_ls gets freed. Fix a couple of error paths so they set errno. Use re-entrant gethostbyname2_r as we are juggling two hostent structures at the same time when starting multicast. Previously the multicast address overwrote the local address as gethostbyname uses a static buffer. reinstate copy of lock name that got lost in the last change. 2004-08-10 David Teigland Make sure plock lp is off all lists before deleting, doesn't appear to fix bug, though. Only allow one lock_dlm thread to do process recoveries when they pile up. 2004-08-10 Adam Manthei fix a regex that erroneously allowed the agent to access devices it shouldn't when using non admin users (see bug #129521) Fix for bug #129521 o modifies regex's so that non-admin accounts can navigate the menus o better invalid user/passwd detection 2004-08-09 Lon Hohberger Fix mutex deadlocks around cluster lock/unlock calls 2004-08-09 Patrick Caulfield Return error if we can't start all the listening sockets. Set return code if we can't leave the cluster. Userspace side of last checkin - sync routines for locking.
Make sure this new library is only installed when you have the matching kernel changes available too. The "lets get all the API changes done in one day" checkin. Add Jeff (sorry, I've forgotten your last name)'s sync API call for userspace along with a separate astarg for blocking ASTs Use new ioctl interface to start the cluster. Replaced setsockopts with ioctls. 2004-08-06 Lon Hohberger Make ccsd give element name back with relevant cdata 2004-08-06 David Teigland plain make in cluster dir doesn't work, but make install should 2004-08-05 Lon Hohberger Remove old message encapsulation pieces and dependency on crc32.c 2004-08-05 Jonathan Brassow - take a stab at the cluster.conf.5 man page - incorrect synopsis - ccs is located in 7 not 8 ccs is located in 7, not 8 2004-08-05 David Teigland When granting locks, ignore locks on the wait queue that are also on the lockqueue. remove another assert A bunch of fixes related to dir lookup's, EEXIST errors from lookups, EINVAL errors from requests. Most prevalent when doing lots of flocks or plocks. 2004-08-05 forrest add libevent.* to RPM package CVS patchset: 3398 CVS date: 2004/08/05 06:39:08 --HG-- extra : convert_revision : b5e56f1784f3782af94ad72c288bc6dd67797922 2004-08-05 Jonathan Brassow - misc prog for magma. 2004-08-04 Jonathan Brassow - the sm plugin for magma seems to emit CE_NULL instead of CE_MEMB_CHANGE when a node leaves or joins... Therefore, we must reread the membership list even when receiving CE_NULL. - allow ccs to reload config file if the node is quorate (covers the case were ccs needs a restart) - return EINVAL if a cluster name is specified that is different than the one that is loaded. 2004-08-04 David Teigland try to clear up the build steps 2004-08-04 Patrick Caulfield Example init script for cman/dlm/fence/clvm Change the initialisation order, so if we get into the kernel twice by some strange behaviour, the error is less alarming. 
2004-08-03 Benjamin Marzinski Removed the dependency on "all" from the "uninstall" target 2004-08-03 Jonathan Brassow - add cmd to remove old libs that were located in /lib (as part of the uninstall target) 2004-08-03 Michael Conrad Tadpol Tilstra - this is the opteron stack bug fix here too. don't think any one has seen it on this code. but it hurts nothing to just put the fix in right away. 2004-08-03 Patrick Caulfield Always copy ast bits, even on convert (kernel version). Always copy AST parameters, even for converts. Remove some redundant code. sockaddr_cl no longer contains anything that kcl_get_node_by_addr can use. 2004-08-03 David Teigland - when processing a combined delete and completion ast, do the deletion first. This avoids the next request coming in and finding the rsb between the ast and the release_rsb which can lead to other problems. - return -EEXIST from a dir lookup when the node looking up the res is already the master. A request needs to be processed differently when this happens. There are larger problems related to both of these, but these changes are legitimate even after the bigger changes that will make them less important. For now, this should allow some tests to run successfully or at least better. 2004-08-03 andrew Added entries for the crm/test dir and the helper script CVS patchset: 3378 CVS date: 2004/08/03 08:43:30 --HG-- extra : convert_revision : 2f7818d7ede9b0313a845beca4373bc86bfb861e 2004-07-30 Patrick Caulfield Do the parameter overrides correctly. Thanks for spotting that, Jeff! 2004-07-29 David Teigland update from src files 2004-07-28 David Teigland The lock counter used for DROPLOCKS callback wasn't accurate. It now tracks the "locks held" value reported by gfs_tool counters. These callbacks don't keep a lid on the number of locks but slow the growth anyway. They now operate above 10,000 locks, delivered once a minute.
Send a DROPLOCKS callback to gfs once every minute while the number of outstanding local locks is over 20,000. Doesn't appear to be too effective but it might be in some cases. retry calls to ccs_connect if they return an error 2004-07-27 Jonathan Brassow - specifying a cluster name is only valid when the force command is used. 2004-07-27 Lon Hohberger Cleanup of useless variable 2004-07-27 David Teigland print the actual error value ccs returns 2004-07-27 Jonathan Brassow - must zero out rset variable before populating. 2004-07-26 Jonathan Brassow - when cman shuts down (as when cman_tool leave happens) close the connection and attempt to reinitiate connection. This may be what is causing rbz128571. - multicast option is not ready yet. Fail if it is specified. - add some man pages 2004-07-26 sunjd add OCF RAs config support CVS patchset: 3353 CVS date: 2004/07/26 09:06:08 --HG-- extra : convert_revision : 623a44b0d5e6c225eeed7d2f181ea44536212e38 2004-07-26 David Teigland Add locking around ls_requestqueue as it's accessed by both dlm_recoverd and dlm_recvd. tidy DLM_ASSERT statements Do reference counting on lockspace structs. Looping mount/umount operations on multiple nodes would quickly result in a ls struct being freed while dlm_recoverd was still finishing a previous recovery on it. 2004-07-26 Patrick Caulfield Clear the "cluster_is_quorate" flag at shutdown 2004-07-23 Jonathan Brassow - sometimes getsockopt would report that bcast was enabled when it wasn't. Since it doesn't hurt to set the flag regardless, we now set the bcast flag w/o checking to see if it is enabled already. 2004-07-23 David Teigland die if we cannot talk with ccs when starting up 2004-07-23 Patrick Caulfield Fix race on islistening queries. Pass the unqualified name into the kernel, so the cluster node name never has domain names attached to it. 
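The 2004-07-23 Patrick Caulfield entry above passes only the unqualified node name into the kernel. A small sketch of that kind of truncation, cutting at the first dot, is shown here; unqualified_name is an illustrative name and the real cman_tool code may differ.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: copy the part of `fqdn` before the first '.' into
 * `out`. Returns 0 on success, -1 if `out` (of size `outlen`) is too small.
 * Caveat: a dotted IPv4 literal such as "10.0.0.1" would be cut to "10",
 * so callers should only pass host names.
 */
int unqualified_name(const char *fqdn, char *out, size_t outlen)
{
    size_t len;

    assert(fqdn != NULL && out != NULL);

    len = strcspn(fqdn, ".");   /* length up to the first dot, or whole string */
    if (len >= outlen)
        return -1;
    memcpy(out, fqdn, len);
    out[len] = '\0';
    return 0;
}
```

A name without a domain part passes through unchanged, so the helper can be applied unconditionally to whatever uname reports.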
2004-07-23 David Teigland - clvmd_fix_conf.sh usage was wrong - add info about building modules outside kernel tree - cluster.xml has changed to cluster.conf When a request completion is received in lock_dlm during recovery, the lock must be demoted to NL and rerequested in the originally requested mode after recovery completes (only NOEXP requests can be completed for gfs during recovery.) When a lock was rerequested, the QUECVT flag was always being set even though the lock may also have the EXPEDITE (gfs's PRIORITY) flag set. It's illegal to use EXPEDITE and QUECVT together, so we now only set QUECVT if EXPEDITE isn't. An assertion fails if both flags are used. remove a printk that can be annoying fenced was not correctly matching fqdn's from cman and basic hostnames from ccs 2004-07-22 Michael Conrad Tadpol Tilstra - decent start of documenting cmdline args and config file for gulm. still needs some work. - gulm will now get its configs from ccsd. I'll be adjusting man pages next so syntax will be there. 2004-07-22 Jonathan Brassow - update configure scripts to set %{libdir} to /usr/lib - make ccs use magma and magmamsg - change from using /etc/cluster/cluster.xml to /etc/cluster/cluster.conf - put in a forgotten unlock in an error condition in magma 2004-07-22 Patrick Caulfield Get rid of zero-initialisers as the compiler (or is it the linker? I always forget) does this for us. Include the quorate state in HELLO messages. 2004-07-22 David Teigland update from src files - Improve/fix the way we lock the rootres list. The two rwsem's we were using still weren't sufficient for cases where release_rsb was called while traversing the root list. Replace the two sems with one new "ls_root_lock". I hope this fixes the rootres list corruption I was seeing. It's much saner locking in any case. - Add some log_debug() statements for cases where dlm_unlock() returns an EINVAL error so we can figure out what went wrong if the lock_dlm assert fails. 
- Have next_move() detect the case where a first start on the ls is interrupted by a recovery event. I expect this will fix the related assert failure that was reported, although I've can't reproduce to confirm. on assert failure dump dlm's debug buffer An unlock request is waiting for a reply from a master node. That master node fails causing a recovery event. As of yesterday we simply treat these unlock requests as completed and don't rebuild them during recovery. Before queueing the completion ast for these unlock requests the lkb must be removed from the rsb queue and the lockqueue. An assertion failure was caused by not doing this. add BUG() to assertion initialize cluster_is_quorate to 0 2004-07-21 David Teigland update from src files 2004-07-21 Patrick Caulfield neaten up some bits by calling send_kill() rather than building up our own KILL message. 2004-07-21 David Teigland if remote_stage2() gets a request, it should return EINVAL if: - no rsb exists for the named resource - the rsb exists but the nodeid is -1 - the rsb exists with some remote master specified The last case wasn't recognized and was causing an assertion failure. (The last two cases together are equivalent to res_nodeid != 0.) 2004-07-21 Patrick Caulfield A couple of endian fixes. 2004-07-21 David Teigland In-progress unlock requests interrupted by a recovery event weren't always handled correctly. I also no longer get failed dir removals during recovery. 2004-07-20 Michael Conrad Tadpol Tilstra - reshaped the way i was building the lock key names. This new method avoids a limit of 8 bytes for node names. - CFLAGS in make/defines.mk is ignored if you don't use += in the Makefiles 2004-07-20 Patrick Caulfield Distinguish between not being able to get the cluster ACTIVE state, and the cluster actually being active. 2004-07-20 Ken Preslan Suiddir support suggested by anton@hq.310.ru. 2004-07-20 Patrick Caulfield Tidy up temp nodeids after a transition. 
2004-07-20 David Teigland tidy a bit of code that decides if a new master should be looked up. makes it easier to experiment with new code in this area. if deserialise_lkb finds the lkb already exists, advance the pointer so we can correctly read the next item and print a debug message so we know if this is actually happening. pass the dlm_header into add_to_requestqueue so we can print the cmd recycle dir entries when rebuilding res directory instead of freeing and immediately reallocating thousands of them. 2004-07-20 Patrick Caulfield Don't clear the temp_nodeid until we've /really/ finished with it. should finally nail #126526 Fix race where two reads/accepts could arrive in quick sucession but only the first would be serviced. This fix does both reads AND accepts as opposed to the last one which only did reads. I reckon this should fix bug #126758 2004-07-20 David Teigland print rsb in remote_stage2 assertion 2004-07-20 Jonathan Brassow - remember to closedir() 2004-07-19 Patrick Caulfield Install man page Man page for cman_tool Print error if no nodeIDs are passed to "cman_tool kill" Tidy up accept path a bit. 2004-07-17 Ken Preslan Reordered munging of modes on inode create. 2004-07-16 David Teigland update from src files - the new MASTER rsb flag wasn't being set during recovery when an rsb from a departed node was remastered; this prevented the dir entry for the rsb from eventually being removed which made all further requests on that rsb invalid - take dirtbl lock when clearing dir during recovery even though it's not strictly necessary - save dir nodeid in release_rsb instead of calculating it twice 2004-07-16 Patrick Caulfield If there's anything left to read after recvmsg, then make sure we get it. oops, remove annoying (and badly formatted) debug prints Validate nodeIDs 2004-07-16 David Teigland dlm_unlock_stage2: copy new lvb from lkb before granting new locks to ensure they get the latest copy. 
Three different problems resulting from recent change to quit doing a convert-to-NL on a gfs unlock: - dlm_unlock wasn't passing VALBLK flag to write lvb on unlock - wake/complete for synchronous (internal) unlocks rather than gfs callback - lm_hold_lvb() wasn't preserving lvb contents since a dlm_unlock would free the lkb and rsb. Now if there's an lvb to preserve, convert to NL on a lm_unlock() and do dlm_unlock on lm_unhold_lvb(). 2004-07-15 Patrick Caulfield Fix some small odd bugs in startup conditions: - when taking over as master, make sure we always get a nodeid for the new node in a NEWNODE transition - Fix some potential oopses in error messages - Allow a JOINACK_NAK message to come after a JOINACK_ACK in case the ACKing node has to back down as master. 2004-07-15 Lon Hohberger Fix breakage in SM plugin when querying group membership w/o logging in; install in /usr 2004-07-15 David Teigland the recently changed version of release_rsb() checks the new RESFL_MASTER flag to decide if a dir_remove is necessary. rsb's remastered during recovery weren't having this flag set. get rid of the _recovery version of dir_lookup which only related to sequence numbers remove "resdata" from function names since we've renamed things - /proc/cluster/dlm_dir to dump resource directory - change 5s constant for recovery wait timer into config.recover_timer - simplify logic that decides if dir removal is needed - add more detail to some log_debug statements - a call to release_rsb() was missing in remote_stage2 after successful lookup of rsb with nodeid of -1 (where einval is returned) - change log_debug to log_all in dir removal errors which should now be very uncommon (if they occur at all) - add a call to schedule between an einval reply and a repeat dir lookup 2004-07-15 Jonathan Brassow - couple changes to make files 2004-07-15 yixiong Fix the "make rpm" issue for cms. 
CVS patchset: 3321 CVS date: 2004/07/14 22:53:41 --HG-- extra : convert_revision : 7dd957434146588caf17c0782ab111837d0e1806 2004-07-14 Patrick Caulfield Clean up the temp nodeIDs list at shutdown. 2004-07-14 David Teigland - change in the way name_to_directory_nodeid works. get rid of the bitmask step and just do a modulus on the hash value. the bitmask introduced an imbalance with some node counts that could cause a more heavily weighted node to be more prone to run out of memory (esp during recovery.) - change the way we handle the case where a granted message arrives before the reply to the original request. take the request out of the queue right away and assign the remote lkid instead of waiting for the delayed reply. this is a rare circumstance but removes potential for problems. 2004-07-13 Jonathan Brassow - add targets to the make file for updating subtrees to latest tags. 2004-07-13 Michael Conrad Tadpol Tilstra - 2004-07-13 Patrick Caulfield Make DLM_LSF_NOCONVGRANT the default for userland lockspaces. 2004-07-13 David Teigland update from src files 2004-07-13 Patrick Caulfield Missed a couple of files from the last tidying commit Remove resdir sequence number from all structs as it's no longer used. 2004-07-13 David Teigland Change the way lm_dlm_unlock() works. We previously demoted to NL on a lm unlock and then did a dlm_unlock() on the lm put_lock. Now do a full dlm_unlock() for each lm unlock. This avoids some odd problems in some gfs tests. We can reintroduce a convert-to-NL optimization later on if need be. 2004-07-13 Patrick Caulfield Fix an obscure but potentially nasty race on the resource directory. 2004-07-12 Ken Preslan Change the way a switch statement works to prevent GCC from doing stupid stuff on a PPC box. [ Arkadiusz Miskiewicz ] 2004-07-12 Michael Conrad Tadpol Tilstra - removed the unneeded utils_verb_flags stuff from lock_gulm.ko 2004-07-12 David Teigland mess with tabs (unaligned in cvsweb...) remove stray printk Naming changes. 
Get rid of struct typedefs and gd_ prefix. Improve some of the structure names. Use consistent hash table method for rsb/lkb/dir lists. 2004-07-09 yixiong a fix for "make rpm" error caused by empty directories. CVS patchset: 3318 CVS date: 2004/07/09 20:01:31 --HG-- extra : convert_revision : d6be04e6cbe1b2d3a36d08e890c28eeeb33d2a86 Taking out the cms stuff for now. CVS patchset: 3317 CVS date: 2004/07/09 19:23:27 --HG-- extra : convert_revision : fb9b5c566a990325b774682e573d3cb05dbf8418 2004-07-09 Benjamin Marzinski Fixed some login structure size issues, so that gnbd clients and servers can be different architectures. 32 bit servers can now have 64 bit clients, etc. Also cleaned up code to get rid of compile warnings on 64 bit machines. 2004-07-09 Michael Conrad Tadpol Tilstra - fixes bug #126970 ifelseifelseifelse doesn't work without the elses. 2004-07-09 David Teigland a recent checkin incorrectly switched unlock_stage2 to reference an lkb after it was potentially freed. - set lkb nodeid to -1 when it's created. this prevents an assert during recovery on an lkb for which the rsb master is being looked up (recent shift in assigning lkb nodeid exposed this) - add print_lkb/print_rsb to some existing asserts 2004-07-09 chuyee cms (ais message service) code initial checkin CVS patchset: 3309 CVS date: 2004/07/09 08:18:14 --HG-- extra : convert_revision : d445331eea4ffb868bbe942063dc021ad95a32c5 2004-07-09 David Teigland in dlm_lock_stage1 set lkb nodeid to the rsb's nodeid only after the rsb is certain to have the correct nodeid itself.
- in unlock get the rsb after holding the in_recovery lock otherwise an in-progress request is not redirected after recovery - add an assert pjc found spot where lock struct wasn't being freed 2004-07-09 Adam Manthei diff -u -r1.3 fence_apc.pl --- fence_apc.pl 8 Jul 2004 22:06:34 -0000 1.3 +++ fence_apc.pl 9 Jul 2004 01:01:52 -0000 @@ -34,7 +34,7 @@ # should be more than 1 my $open_wait = 5; # Seconds to wait between each telnet attempt my $telnet_timeout = 20; # Seconds to wait for matching telent response -my $debuglog = '/tmp/apclog' # Location of debugging log when in verbose mode +my $debuglog = '/tmp/apclog';# Location of debugging log when in verbose mode $opt_o = 'reboot'; # Default fence action. 2004-07-09 Benjamin Marzinski added -fPIC to the makefiles, so the cman.so, gulm.so, etc.. would compile on x86_64 2004-07-09 Adam Manthei o added support for APC 79XX o additional error checking for masterswitch setups. Not specifying a switch # to operate on will cause an error if there is more than one switch present. o whitespace cleanup 2004-07-08 Patrick Caulfield Stop users calling dlm_[ls_]_pthread_init() more than once. Reset fd flags on return from dlm_dispatch(). Add timeout feature for resdir entries so they don't go away as soon as the last lock is dequeued. You can configure the time using a config entry, set this to zero to get the old behaviour 2004-07-07 gshi implemented uuid as nodeid CVS patchset: 3307 CVS date: 2004/07/07 19:07:14 --HG-- extra : convert_revision : b3c35d55e5eafda3ab53c3b81b26e72e5854f798 2004-07-07 Patrick Caulfield Fix assertion.
2004-07-07 David Teigland loosen assertion to allow us to send an rsb lookup to ourselves this can legitimately happen during recovery by resend_cluster_requests update from src files - when an assertion fails dump all rsb's and lkb's to console - when a dlm_unlock() removes the last lkb from an rsb, reset the master nodeid in the rsb to -1 to force a subsequent dlm_lock() to look up the potentially new master. Add find_rsb_to_unlock() for this which holds a lock to make the test and set atomic wrt new dlm_lock()'s. - lots of assertions added - lots of log_debug()'s added (/proc/cluster/dlm_debug) 2004-07-06 David Teigland remove and clean up some debugging - Patrick's fix to set rsb nodeid to -1 during unlock crept into a previous checkin. - Modify release_rsb() to cope with -1 nodeids and properly remove resource directory entries. 2004-07-05 Patrick Caulfield Tidy atomic decrement. Add support for more VMS-like locking mode where new locks will not be granted if there are already locks waiting for conversion. This is enabled per lockspace and is OFF by default. Fix a couple of 64bit compiler warnings. Add in more architectures. Note: this is /not/ a list of supported architectures, in fact very few other than i386 have been tested (and then not for some time). But fixes for other architectures are welcomed. 2004-07-05 David Teigland minor changes to resource directory lookups. - increment rd entry's seq number safely by holding the rd lock - don't pass rd structs to callers since there's no reference counting on them; just return the nodeid and seq values instead. safer and cleaner approach. - add debugging code which will be reduced when current problems are sorted out (strname used for debug prints will be removed) 2004-07-03 Benjamin Marzinski Fixed some annoying bugs. Previously, when you unimported a gnbd (with gnbd_import -R or -r) you could try to kill a gnbd_recvd process that was already dead.. 
This would result in you killing the pid -1, and bad things happening. Also, gnbd_monitor wasn't accepting connections, which was causing commands to hang. This problem was introduced in fixing an earlier bug. Both these issues are fixed. 2004-07-02 Jonathan Brassow - work around broken exit 2004-07-02 David Teigland fix log message better log message include directory sequence numbers in prints another warning and assertion a bunch of assertions to catch errors early on 2004-07-02 Benjamin Marzinski Changed the /dev/gnbd_ctl node I make from major 11 to major 10, which is what misc devices are. I don't know why I haven't seen any issues with this until now.... As far as I can tell, gnbd_import shouldn't have been working at all. If anyone has been seeing gnbd_recvd[8311]: ERROR [gnbd_recvd.c:190] cannot open gnbd control device : No such device or address messages, they should go away now. 2004-07-01 Lon Hohberger Rename variables/macros/structures to not use leading __ 2004-07-01 Patrick Caulfield Make sure that the backout time for a "NEWCLUSTER" message is less than the joinwait time. 2004-07-01 Benjamin Marzinski initialize the polls array. This will make sure that gnbd_monitor realizes that there is free space to add a new connection. updated gnbd kernel patch to include recent changes 2004-06-30 Ken Preslan Bring patch uptodate with the sources. 2004-06-30 deng.pan add directory for checkpoint test cases CVS patchset: 3279 CVS date: 2004/06/30 03:48:43 --HG-- extra : convert_revision : 1b2e148ee7943fd3544e1640040dff7633502a43 2004-06-29 Jonathan Brassow - change log code to not print log_msg if verbose is not set. - fix problem where ccsd exits if cman_tool leave is performed. If a cman_tool leave occurs, ccsd will drop into non-quorate mode - disallowing connections unless they are forced - which should only be done by cluster managers. 2004-06-29 Lon Hohberger Make dumb driver act like there's a real, one-node cluster. 
2004-06-29 Benjamin Marzinski missing "_" in gnbd_monitor... caused gnbd_import to fail with uncached devices devfs was set to use /dev/gnbd/, but gnbd_import sets the gnbds cluster unique name in /dev/gnbd/.. This was causing problems for devfs users. The devfs links for gnbd devices will now be in /dev/gnbd_minor/ These should never be used, since this device name is not consistent across the cluster. 2004-06-29 David Teigland update from src files Fix the way we do lkb deletions. The DELAST flag is removed and a delete arg is passed to queue_ast() when the lkb should be deleted following the ast delivery. In the past, setting the DELAST flag on an lkb could result in it being detected by an in-progress deliver_ast() and the lkb being deleted early. I'm hopeful this was the cause of the oopses we'd see in process_asts(). include lockspace.h 2004-06-28 Benjamin Marzinski fixed obvious error.. If the gnbd_ctl device wasn't present, I was trying to remove it, causing: gnbd_import: ERROR cannot remove /dev/gnbd_ctl : No such file or directory 2004-06-28 Michael Conrad Tadpol Tilstra - this is the idea I have for having gulm use ccs. This code here compiles, but has not been tested. Further, gulm will not use ccs, unless you compile with -DREADYFORCCS I am a little leery about some of the string fiddling I did, which is why I want to test it first. But being that I'm going on vacation soon, I wanted to leave you all with this code. Feel free to play, but do not be surprised by bugs. - fiddled with the symbols exported by libgulm. 2004-06-28 Lon Hohberger Display node IDs as 64-bit hex Fix wrong-side of ipv6 addr->node ID assignment Fix build problems breaking open(/usr/lib/magma/plugins, O_DIRECTORY...) Fix build problems; fix plugins to not use cml_free 2004-06-28 Adam Manthei Addresses an issue that was seen w/ the APC Masterswitch. logins can fail if the APC is logged into too fast. This update does two things to reduce the likelihood of this bug: 1.
It tries to log out cleanly instead of just closing the telnet connection 2. retries twice upon login failure. There is a bug in the APC that will cause it to reset itself from time to time. There doesn't appear to be anything that we can do about that though. Thanks (and blame! :) go to Erling for helping resolve the issue. 2004-06-28 Lon Hohberger Magma plugins API fixes for GuLM, CMAN, CMAN/SM 2004-06-28 Patrick Caulfield Some older systems need explicit -lpthread 2004-06-28 David Teigland allocate name space for max length, not actual length sm_ioctl's copy_from_user always copies the max length in register use copy_from_user to get the register name always copy in max name length and then find string len put lockfile in /var/lock/ by default some tidying and improve error messages 2004-06-26 David Teigland change build/install order 2004-06-25 Ken Preslan Switch to using a fixed version of OGFS' shell sort. Make sure the VFS ACL code is compiled in if GFS is enabled. 2004-06-25 Benjamin Marzinski The GNBD clients and servers now distinguish each other via nodename, instead of IP. This was keeping gnbd from working smoothly with gulm on machines with one ip interface for lock traffic and one for block traffic. Also it removes some ipv4 to ipv6 headaches. The clients now connect to the server via ipv4 or ipv6.. ipv6 preferentially. You can force a connection from the client to the server to go over a specific path by using the ip address instead of the hostname. A bunch of code cleaning.. removing duplicated functions and such. Two related issues that still need to be done.. The server only accepts ipv4 connections now, but this is trivial to fix, and fence_gnbd needs a parameter switch, from ipaddr to nodename. These will get checked in shortly.
2004-06-25 Jonathan Brassow - check in patch for making links correctly in make file ~ patch by Arkadiusz Miskiewicz 2004-06-25 andrew Slight variation on horms' approach which also allows for a centrally defined definition of where the CRM dir is. CVS patchset: 3261 CVS date: 2004/06/25 10:57:18 --HG-- extra : convert_revision : b3c085608a6d76fd8b5389767d369eb24645e07d 2004-06-25 David Teigland ChangeSet@1.1683.1.1, 2004-06-25 11:19:18+01:00, patrick@jeltz.pjc.net Fix oops in dlm_query. 2004-06-25 Patrick Caulfield Create /dev/misc if it doesn't exist. 2004-06-25 David Teigland ChangeSet@1.1684, 2004-06-24 23:22:05+08:00, teigland@redhat.com DLM: fix compile error (undefined proc_ls_name) if PROCLOCKS config is off ChangeSet@1.1681.1.1, 2004-06-24 05:14:16-05:00, pcaulfie@tng2-1.lab.msp.redhat.com membership.c: Save or create a temp nodeid for a new node that joined via a NOMINATE message. should fix bug #126526 cnxman.c: endian fix for temp nodeids cnxman-private.h: Prototype for new_temp_nodeid() 2004-06-25 Ken Preslan HCH's suggested cleanups to the GFS mount/unmount code. 2004-06-24 Jonathan Brassow - you meant clu_fence not cp_fence, I assume. 2004-06-24 Lon Hohberger Fixes from Ben Marzinski 2004-06-24 Jonathan Brassow - update make files - make gnbd buildable outside kernel - add ability to compile outside kernel - requires dlm-kernel and cman-kernel are installed 2004-06-24 Lon Hohberger Replace missing clu_fence() API call. 2004-06-24 Jonathan Brassow - targets to build outside the kernel. - requires that cman-kernel has been installed 2004-06-24 David Teigland ChangeSet@1.1682, 2004-06-24 13:34:54+08:00, teigland@redhat.com DLM: newly added log_debug() in next_move() used gr pointer after freeing 2004-06-24 Alasdair G. Kergon initial checkin Initial checkin. 
2004-06-24 gshi hbaapi.h is dependent on time.h AC_CHECK_LIB() should be put in front of CFLAGS resetting to include CC_WARNING CVS patchset: 3253 CVS date: 2004/06/23 23:20:17 --HG-- extra : convert_revision : 225e26136f78f44caa40973ed5f42e2d98b73427 2004-06-22 horms Use instead of perl CVS patchset: 3249 CVS date: 2004/06/22 10:08:11 --HG-- extra : convert_revision : b7467f195ff1278a2c5800ce873ee3bd383b8598 I have gone over the whole tree and made it so all, and I mean all source files should now be included in the distributed tar ball. Curiously I discovered some code, the snmp_subagent (snmp_subagent/), and perl bindings (lib/bindings/) that would never be included in the tar ball, no matter what. Also, lmb, I had to do some surgery to lib/bindings/Makefile.am to include a bunch of extra files in the tar ball and make it build under makedistcheck. You might want to double check what I have done there. CVS patchset: 3248 CVS date: 2004/06/22 09:04:31 --HG-- extra : convert_revision : 40341a3adad120a765305938434845862a84bddb 2004-06-18 gshi check if there is hbaapi header file and library files omit or compile hbaapi accordingly CVS patchset: 3240 CVS date: 2004/06/18 18:27:58 --HG-- extra : convert_revision : d8329ec213e67f9dcf4c1a7df85e27dc85882a6d 2004-06-07 msoffen Cleaned up test for sys/pstat.h and added creation of include/crm/common/Makefile CVS patchset: 3195 CVS date: 2004/06/07 21:25:48 --HG-- extra : convert_revision : 894c15c7dcb18392ac1753ce9b48e23d0a3fdc31 2004-06-04 sunjd for configuring new subdir tools CVS patchset: 3180 CVS date: 2004/06/04 02:35:51 --HG-- extra : convert_revision : 7746dbd4c4053e6c29c2d8e63fafc955e97a898f 2004-05-24 deng.pan Make checkpointd a configurable module.
Default is not configured CVS patchset: 3122 CVS date: 2004/05/24 06:12:26 --HG-- extra : convert_revision : 2966e9a3bf0cd7ecb635a6af0d229d0894403eb0 2004-05-20 horms The correct SNMP libraries to link against are required for the apcmastersnmp to link correctly, which is built if either the ucd-snmp or net-snmp headers are found. If they are not present then apcmastersnmp will build but will be unusable. CVS patchset: 3114 CVS date: 2004/05/20 06:11:30 --HG-- extra : convert_revision : 34d73159e44072b9a2d0da9e69063b65ebb17893 Fixed anomaly where IPv6addr is in the STABLE_1_2 branch but not the HEAD branch CVS patchset: 3110 CVS date: 2004/05/20 02:46:34 --HG-- extra : convert_revision : 57561606f8da322a4e722325962504aa33befb85 2004-05-19 msoffen Moved no-strict-aliasing check with the other flags. CVS patchset: 3107 CVS date: 2004/05/19 13:40:47 --HG-- extra : convert_revision : 0a53c8fb85ff70f3364429e278a89250ebddd0f4 2004-05-18 gshi adding a line to generate contrib/mlock/Makefile CVS patchset: 3102 CVS date: 2004/05/18 21:44:21 --HG-- extra : convert_revision : 34e0fb13115a4ddf0a354f6498a87eac10035743 2004-05-12 msoffen Added AC_CHECK_HEADERS for tcpd.h CVS patchset: 3068 CVS date: 2004/05/12 16:35:59 --HG-- extra : convert_revision : d1a0053ec26ec4eac2a01f82d025224c3957ff01 2004-05-06 andrew It *is* an error not to have UUID available when --enable-crm or --enable-lrm is specified for configure.
(This change was linux tested) CVS patchset: 3011 CVS date: 2004/05/06 15:16:43 --HG-- extra : convert_revision : cd4f3df965124d16d80ff0a3299b6d040928095c 2004-04-28 msoffen Fixed snmp-subagent to build in separate build tree and to not fail if not available (just disable option) CVS patchset: 2946 CVS date: 2004/04/28 16:49:20 --HG-- extra : convert_revision : 5338c7d1d6791b0448e9005d1d1b8774aeaa9dc6 2004-04-26 msoffen Fixed to properly disable the snmp subagent CVS patchset: 2928 CVS date: 2004/04/26 19:35:00 --HG-- extra : convert_revision : 0b5c577de3c2fdcfadb30af7d5bdd773f30ec8cd Changed so that if trying to compile in snmp-subagent and you don't have the libs, it's a warning and just disables the snmp-subagent build and continues. CVS patchset: 2927 CVS date: 2004/04/26 18:53:58 --HG-- extra : convert_revision : e0df93685a1ba3937973776fe9c2367a2f010eb5 Fixed so that snmp is optional (only when enable-subagent is active) CVS patchset: 2926 CVS date: 2004/04/26 15:26:12 --HG-- extra : convert_revision : 26503c39a3736a61bbeac419c9fef7760442749a Changed to look for glib*-config if pkg-config isn't available CVS patchset: 2923 CVS date: 2004/04/26 12:29:51 --HG-- extra : convert_revision : b9e0d457d81358b5460da8c1a76e6cf6fce41ef2 2004-04-23 horms The lib/uuid/ directory is empty in CVS, so this breaks the build CVS patchset: 2913 CVS date: 2004/04/23 01:49:21 --HG-- extra : convert_revision : 2c955c9564e21cc0d077fdcfb6fcfb924dcf95fc Tweaked UCD SNMP library detection a little more CVS patchset: 2912 CVS date: 2004/04/23 01:37:23 --HG-- extra : convert_revision : 753cdfac4f8640fdeeb5f9303cac91ece61bff30 2004-04-22 msoffen Added several AC_MSG_RESULTS, cleaned up detection for swig functionality CVS patchset: 2911 CVS date: 2004/04/22 21:28:51 --HG-- extra : convert_revision : 2f8459776264bd72a65da528546296e8bf4f3a4c Changes to identify uuid lib and added pkg information CVS patchset: 2910 CVS date: 2004/04/22 21:24:45 --HG-- extra : convert_revision :
21b787d7b13a9b0bc4f65c7bd2003d6b1308e1bb 2004-04-22 andrew Handle the lack of libxml2 support a little more nicely CVS patchset: 2908 CVS date: 2004/04/22 07:06:59 --HG-- extra : convert_revision : e0375ce93dacc6c10d38c451c480cf750e3b7ac0 Insert missing "test" part of snmp check CVS patchset: 2907 CVS date: 2004/04/22 07:00:49 --HG-- extra : convert_revision : 58245c2f8b956a3f5bc21f1e070eca35d9df06f7 2004-04-21 andrew Further changes to allow the next-gen pieces to compile on MacOS X * Renamed shutdown to (crm|cib|tengine)_shutdown to avoid clashing with someone else's definition. Something to do with flat namespaces in OSX. * Renamed config to lrm_config in lrm/clientlib.c for the same reason above. NOTE: I dont see this used anywhere.... whats it for? * Made lrm_api.h include portability.h for its definition of uuid_t * Added some include directories so configure can find uuid.h and others * Use the mach init.d location. We'll need to write a mach init script one day... CVS patchset: 2905 CVS date: 2004/04/21 18:50:28 --HG-- extra : convert_revision : 09466a79bbe27993f0414364c0941bf39acb8530 2004-04-21 msoffen SNMPCONFIG is needed for FreeBSD to compile Stonith/etc. that use snmp-config CVS patchset: 2902 CVS date: 2004/04/21 14:42:14 --HG-- extra : convert_revision : 51f4a88fb39d3aa8f08ad80b9bdbf36efe0e99d5 2004-04-20 andrew Changes to allow for compilation on Darwin (aka. MacOSX) CVS patchset: 2896 CVS date: 2004/04/20 21:21:19 --HG-- extra : convert_revision : 0177b0fbaf845136714ba0440cc52049ec75e51a 2004-04-20 msoffen Added logic checks for uuid and mail/mailx. Added replacement function for strndup. CVS patchset: 2887 CVS date: 2004/04/20 19:51:03 --HG-- extra : convert_revision : 0ffeb09f9cb86e3afccdb87d88c4ee6f36c1eed2 2004-04-20 horms There was a bogus test for ucd-snmp-config, which as far as I can work out doesn't exist. This resulted in the apcmastersnmp stonith module not being linked against an SNMP library on systems with UCD SNMP. 
This patch should leave the behaviour for systems with Net SNMP intact, that is use net-snmp-config. But for UCD systems it will test to make sure that libsnmp or libucdsnmp (in that order) are present and if not produce an error. If they are not in the library path they can be added to LDFLAGS. CVS patchset: 2884 CVS date: 2004/04/20 09:43:06 --HG-- extra : convert_revision : 7f7e036dc611acb3535a6167b2c775cd4fa46f9f 2004-04-18 alan Put in lots of changes related to credential structures CVS patchset: 2859 CVS date: 2004/04/18 19:20:10 --HG-- extra : convert_revision : f514f0a26a80bf22f5f195bdb08c0dee2d1149ed 2004-04-13 gshi rename ha_takeover to hb_takeover to keep name consistency with hb_standby CVS patchset: 2828 CVS date: 2004/04/13 19:13:20 --HG-- extra : convert_revision : f89a549036d119013e920531dc35b10960d00519 2004-04-10 gshi add heartbeat/rc.d/ha_takeover into generated files list CVS patchset: 2804 CVS date: 2004/04/09 22:33:20 --HG-- extra : convert_revision : 413a30dd358c681e8d70c5b1fe673f4626e5eda8 add heartbeat/lib/ha_takeover into generated files CVS patchset: 2802 CVS date: 2004/04/09 22:13:59 --HG-- extra : convert_revision : 1c3969d01715414acdd83127a3d41b79b2d26723 2004-04-06 andrew LRM Warning Messages not printing correctly "if" test had gone missing CVS patchset: 2780 CVS date: 2004/04/06 12:14:56 --HG-- extra : convert_revision : d1e279830dde0e70485175ce87c6a59967494360 Make the enabling of lrm if crm is specified work Do it in a better place CVS patchset: 2779 CVS date: 2004/04/06 11:05:28 --HG-- extra : convert_revision : e8bb0d9264251d15fcdfc1bb55324b78270feede Fake --enable-lrm=yes if --enable-crm is specified Complain loudly if --enable-lrm=no but --enable-crm=yes Use the AC_CONFIG_FILES macro to avoid constantly regenerating Makefiles CVS patchset: 2778 CVS date: 2004/04/06 09:49:22 --HG-- extra : convert_revision : 6dae3c0475ec314065e2ff2589de491373dcb684 2004-04-05 andrew Variables used in Makefiles must always be defined else
configure breaks on some platforms. Removed some unused configure variables. CVS patchset: 2769 CVS date: 2004/04/05 15:26:28 --HG-- extra : convert_revision : ca681312299e3ab1d76567a9610b1e84a508b575 2004-04-03 alan Fixed a test for linux/ CVS patchset: 2766 CVS date: 2004/04/02 22:39:07 --HG-- extra : convert_revision : 734935cb31f725500626aed685dc3e02f2d8cdc0 2004-03-26 andrew + Fix typo in lrm section - Remove deleted lrmd directory from compile list CVS patchset: 2707 CVS date: 2004/03/26 14:31:36 --HG-- extra : convert_revision : 6db7e38154ef894c11815ea51268f1317b479ff1 2004-03-26 alan Added the Local Resource Manager from Zhen Huang and Jiang Dong Sun to CVS. CVS patchset: 2686 CVS date: 2004/03/26 05:00:46 --HG-- extra : convert_revision : a6f6b3303f44d0bb6768e37e43e5aa650015119c 2004-03-25 alan Various changes. Mostly for moving libraries, preparing for glib 2.0, and detecting the watchdog header files... CVS patchset: 2668 CVS date: 2004/03/25 08:08:00 --HG-- extra : convert_revision : 876eb848f98a9d91e57a7dd9aaf236d2f9be3ba5 2004-03-24 yixiong Adding checks for unistd.h and stdint.h so lib/plugin/HBauth/sha1.c can be compiled on RH 7.3 CVS patchset: 2665 CVS date: 2004/03/24 21:59:43 --HG-- extra : convert_revision : b0ea96c1248062e6353532ef742f47f29063f188 Adding a --enable-eventd option to configure.in. The Eventd is not compiled by default. CVS patchset: 2664 CVS date: 2004/03/24 20:52:13 --HG-- extra : convert_revision : 2932ca8c3e96a16e74166b5ec3fe4b2dcd248b65 Disable building the eventd for now. CVS patchset: 2660 CVS date: 2004/03/24 19:20:17 --HG-- extra : convert_revision : 04ff346a547bdfd739149d8635f23a795216a4fe Disable building the eventd for now. 
CVS patchset: 2659 CVS date: 2004/03/24 19:17:01 --HG-- extra : convert_revision : 530e16c91f6fed3f77dd8c303f4445cf080d5aa2 2004-03-19 forrest check in the AIS event service code CVS patchset: 2619 CVS date: 2004/03/19 07:21:54 --HG-- extra : convert_revision : c3466d3170ab4c4157490cb29b89a7078cae572d 2004-03-18 lars Make recent gcc not complain about dereferencing type punned pointers. CVS patchset: 2590 CVS date: 2004/03/18 09:43:58 --HG-- extra : convert_revision : 4a302f9c642d7f70768ef857d3ae679a558dcf64 2004-03-09 lars Restructure SWIG test to be more robust, also default to not build with SWIG by default. CVS patchset: 2543 CVS date: 2004/03/09 08:50:31 --HG-- extra : convert_revision : 11e18e2bfcedad5c89ecb1cfaa4771a13bb3a4e1 2004-03-04 lars Work around versions which are completely different. CVS patchset: 2525 CVS date: 2004/03/04 21:26:22 --HG-- extra : convert_revision : 76ba5b9c5b75c823dcf17322af66204945926bac Require a SWIG version 1.3.x, with x >= 19. CVS patchset: 2523 CVS date: 2004/03/04 20:08:01 --HG-- extra : convert_revision : 2173d6bf3e84b41fb05667452df03714771683df Vendors / distributors need to install Perl modules differently from regular administrators. CVS patchset: 2519 CVS date: 2004/03/04 17:20:15 --HG-- extra : convert_revision : 4aff437c9c21a7956cab8413c90bb2321c0f9516 Merge SWIG/Perl5 bindings. CVS patchset: 2517 CVS date: 2004/03/04 11:33:11 --HG-- extra : convert_revision : 048af828306a81d8df068b985fd7e3c92617950b 2004-02-23 horms Added SendArp resource CVS patchset: 2501 CVS date: 2004/02/23 07:53:50 --HG-- extra : convert_revision : d6255736b9040bf86a94d6d7e10c154bb3980c1e 2004-02-17 lars Moving CVS HEAD to 1.3.x CVS patchset: 2488 CVS date: 2004/02/17 21:16:42 --HG-- extra : convert_revision : 5d1b2b3aaff19badd6863e0d234b9c944ba69ba9 2004-02-17 alan Updated changelog and version number in preparation for version 1.2.0 to come out. 
CVS patchset: 2482 CVS date: 2004/02/17 05:50:30 --HG-- extra : convert_revision : 3faceb449e079f0b34e1e13221c8adef106b7fb3 2004-02-11 alan Put in change log updates and changed the version number in preparation for the next beta. CVS patchset: 2461 CVS date: 2004/02/11 03:27:13 --HG-- extra : convert_revision : d92c52afa6842d4aec5222918fc444a9448e1c7f 2004-02-08 alan Committed some CTS test improvements from Mi Jun CVS patchset: 2439 CVS date: 2004/02/08 10:08:40 --HG-- extra : convert_revision : bd062f362880686add5492b250d45ff7287928e4 2004-02-06 horms Removed bogus version (my bad) CVS patchset: 2428 CVS date: 2004/02/06 07:22:20 --HG-- extra : convert_revision : b527d8b53fe68ddc25e76f209b94ef7a930691a0 Fixed duplicated global definitions CVS patchset: 2427 CVS date: 2004/02/06 07:18:15 --HG-- extra : convert_revision : d1fd53da4b3acc1b24b6e29fe3c6c9d401ee7efe 2004-02-05 msoffen Changes for getlon on freebsd if library is installed and added constant to let getopt_long to be used. CVS patchset: 2421 CVS date: 2004/02/05 21:39:09 --HG-- extra : convert_revision : 3031182b7ae50f394e8ffa424f55e5b426bb4283 Added daemon as a replacement function. 
CVS patchset: 2411 CVS date: 2004/02/05 15:02:55 --HG-- extra : convert_revision : 6f2a21327225f276063fa77c9d62f1e08fe81dec 2004-02-04 andrew cleaned up the declaration of CRM related variables verified that crm/* is not considered if --enable-crm is not specified CVS patchset: 2401 CVS date: 2004/02/04 22:01:14 --HG-- extra : convert_revision : 2e25138bd332f7e9926884110ac32931f4baaea1 2004-02-04 msoffen Corrected tests for netinet stuff and made it properly test headers on Solaris/FreeBSD CVS patchset: 2398 CVS date: 2004/02/04 17:50:54 --HG-- extra : convert_revision : 0ae8f4e7efa359d048410c2210e809aabba1b1d4 2004-02-03 andrew handle malloc debug makefile options properly/centrally CVS patchset: 2388 CVS date: 2004/02/03 22:28:06 --HG-- extra : convert_revision : 9e558af79708aaeb9f1b1c71c5b49d109174949a 2004-01-27 yixiong I am not sure how long we've been missing this double quote. It is obviously needed. CVS patchset: 2341 CVS date: 2004/01/27 20:23:09 --HG-- extra : convert_revision : f1cce0c55e1216cb8b1292a524e285b26fa4d1a0 2004-01-23 alan Updated version number, changelog. Added a missing library to the heartbeat spec file for the data checkpointing API. CVS patchset: 2331 CVS date: 2004/01/23 05:31:21 --HG-- extra : convert_revision : f77866a0e88c1c6be1d1fdf752a1a84553ae3ead 2004-01-22 lars CRM cleanups by Andrew. CVS patchset: 2326 CVS date: 2004/01/22 12:24:21 --HG-- extra : convert_revision : b9b60111d3ac0ef8c08ac6c22960c1ee399f3f0a 2004-01-15 lars Fixing the compile again; CRM_DIR was getting checked for after the substitution, sorry. CVS patchset: 2294 CVS date: 2004/01/15 16:58:31 --HG-- extra : convert_revision : 858e70a37fe23de640a5c206df3b6506774ee872 2004-01-14 lars Initial CRM merge. Not compiled in by default, nothing changes if not activated. Not functional yet! 
CVS patchset: 2291 CVS date: 2004/01/14 12:00:30 --HG-- extra : convert_revision : da866ce8714bc067de8bc9d54cfff4afc5b4c051 2003-12-21 horms Construct non-fatal cflags for use with code generated by lex and yacc CVS patchset: 2276 CVS date: 2003/12/21 11:49:56 --HG-- extra : convert_revision : a4ab77e486f230aee2b648934387dc2f8c37369a Don't use type-punned pointers. Fixed some uninitialised variables CVS patchset: 2275 CVS date: 2003/12/21 11:18:37 --HG-- extra : convert_revision : 5c709818c637a247c3d6f56e9f037b4caf402927 Allow user-defined CFLAGS to be used CVS patchset: 2268 CVS date: 2003/12/21 07:49:05 --HG-- extra : convert_revision : 3718d22c019fe926ae623f8e964d689f141023c0 2003-12-19 alan Fixed the configure.in file so we can enable dmalloc as a configure option. CVS patchset: 2264 CVS date: 2003/12/19 16:35:58 --HG-- extra : convert_revision : c106a62a6c897c0f9d09cb1f7ae03f79d6600cc8 2003-12-17 alan Put in various small fixes necessary to make an rpm properly. CVS patchset: 2259 CVS date: 2003/12/17 18:46:15 --HG-- extra : convert_revision : a2a160dfd5dd4eba8f595c59f9d717bde30a3fe2 Incorporated a large set of new functions for the linux-ha project. This new set of code is the checkpoint daemon from Pan, Deng. It implements the SAF data checkpoint facility. CVS patchset: 2256 CVS date: 2003/12/17 16:52:54 --HG-- extra : convert_revision : 2b83ad236e5b7a1abc9a873ab308b39d6131c838 2003-12-12 alan Added a real replacement strnlen function. Should actually be pretty fast too... CVS patchset: 2251 CVS date: 2003/12/12 18:23:28 --HG-- extra : convert_revision : a67f70cb16ce435c62dd690da2d9dd4429bc7973 Added a check for the strnlen function. This is important since the code has depended on it quite a bit for a long time, but it hasn't been testing for it. So, it was always using strlen instead of strnlen. This is a bad thing. 
CVS patchset: 2249 CVS date: 2003/12/12 05:27:38 --HG-- extra : convert_revision : 47a406286fc5593f195eb276d40ad3213b22a9e0 2003-10-15 horms oops CVS patchset: 2212 CVS date: 2003/10/15 17:15:12 --HG-- extra : convert_revision : 3e4b1b44e311e944c3f1ef5f52baa3e52adf38ac Added IPaddr2 CVS patchset: 2211 CVS date: 2003/10/15 17:05:23 --HG-- extra : convert_revision : 66d6c93fd690000dd816b6c1851e41b98c25f3c7 Move utilities that link against libnet (sendarp and get_hw_addr) to their own directory so we no longer have to come up with creative ways to selectively mangle CFLAGS. CVS patchset: 2206 CVS date: 2003/10/15 09:52:11 --HG-- extra : convert_revision : 130aa5debb93475c02db60f28e369e7398de1fc0 2003-10-14 alan Added Yixiong Zou's IPMIlan STONITH plugin to CVS CVS patchset: 2205 CVS date: 2003/10/14 18:32:48 --HG-- extra : convert_revision : 45a2fd71a2ff697924462e7966ce093d9dcd1aa2 2003-09-19 alan Updated the version string to 1.1.3 - in preparation for the next release. CVS patchset: 2172 CVS date: 2003/09/19 19:14:09 --HG-- extra : convert_revision : 5a94a5d6a660c51400c107574ea7469881165a2b 2003-08-15 horms Tidy up distclean a bit. Zhu, Yi CVS patchset: 2157 CVS date: 2003/08/15 04:38:49 --HG-- extra : convert_revision : c8f63ce1a0c3747f0d78ac901c65fd6436ffc59a 2003-08-06 horms fixed fatal-beatings CVS patchset: 2139 CVS date: 2003/08/06 06:58:40 --HG-- extra : convert_revision : 45f20e4c02752f679d849f66f5178626cbd0d3b0 2003-07-31 alan Added the Xinetd resource script to CVS Added the OCF wrapper resource script to CVS CVS patchset: 2135 CVS date: 2003/07/31 13:34:01 --HG-- extra : convert_revision : 1606e5debf848918b63d5da3245320b2626dc5cd 2003-07-09 alan Changed configure.in to test the legality of warning flags before enabling them. CVS patchset: 2100 CVS date: 2003/07/09 14:13:17 --HG-- extra : convert_revision : 6aee4c59464732c3d7f7b3d9f55a0be33dceac9d 2003-07-01 alan Added the portblock resource script to the configure.in file. 
CVS patchset: 2070 CVS date: 2003/07/01 16:40:13 --HG-- extra : convert_revision : c103bb7c8dbab6262bbb6f3ea411b6017561db09 2003-07-01 msoffen moved one flag to Linux only ( -Wmissing-format-attribute ) as it does not work on FreeBSD or Solaris. CVS patchset: 2067 CVS date: 2003/07/01 15:36:39 --HG-- extra : convert_revision : 57f029a22172211340fd2874b1f8a631ac820c03 2003-07-01 alan Added more flags for warnings. + -Wbad-function-cast \ + -Wformat-security -Wformat-nonliteral \ + -Wmissing-format-attribute \ This all compiles correctly now -- though I may have a little testing to do for STONITH/baytech module. CVS patchset: 2056 CVS date: 2003/07/01 02:30:46 --HG-- extra : convert_revision : 39041f1cc7ab161d6d761512cb059c259d57697a 2003-05-21 alan Incremented version number... CVS patchset: 1929 CVS date: 2003/05/21 19:09:25 --HG-- extra : convert_revision : af9da118c164756a4aaef1d39e8b4b8c1624449c 2003-05-20 alan Got rid of the CLK_TCK stuff from configure.in and related places... CVS patchset: 1919 CVS date: 2003/05/20 05:55:10 --HG-- extra : convert_revision : 77af66b6f830fd6d3d2fcfad8b5e08ea1e923a41 2003-05-05 alan Put in code so that we know the path to the "strings" command. CVS patchset: 1875 CVS date: 2003/05/05 11:36:51 --HG-- extra : convert_revision : 77820db2e50200f049dddfe0a781fc8421cc7258 2003-04-30 alan Put in code to support the Dell remote access controller as a STONITH device. CVS patchset: 1852 CVS date: 2003/04/30 21:04:08 --HG-- extra : convert_revision : fe72df649645d770b98e67c27b4c983aaf270b14 2003-04-18 msoffen Cleaned up configure.in (Removed unneeded vars). Added unsetenv (Not present on Solaris) from FreeBSD code. 
CVS patchset: 1832 CVS date: 2003/04/18 16:42:54 --HG-- extra : convert_revision : 27b9f215446083eecfd55ac222b501addb2d880b 2003-04-16 alan More process-restructuring stuff CVS patchset: 1810 CVS date: 2003/04/15 23:11:49 --HG-- extra : convert_revision : dc6b55b55f2e453cfbc9c1345de088031643ed9f 2003-04-15 msoffen Changes to get heartbeat/lib/BasicSanityCheck.in running on Solaris. CVS patchset: 1800 CVS date: 2003/04/15 17:57:04 --HG-- extra : convert_revision : ee1266f6ff931ec67fcc9f23f6b46e575c99ebb2 2003-03-31 msoffen Changes to get things to build properly under FreeBSD/Solaris. CVS patchset: 1773 CVS date: 2003/03/31 06:30:01 --HG-- extra : convert_revision : 5f4e0ea86967b064b9734761e251f4d98e63a152 2003-03-24 horms Replaced README with README.in CVS patchset: 1747 CVS date: 2003/03/24 09:18:57 --HG-- extra : convert_revision : de56317747af545df3f2d6f63d42adc13df8abbd 2003-03-19 alan Got rid of the Recovery Manager changes plus fixed problem with detecting snmp headers CVS patchset: 1731 CVS date: 2003/03/19 19:23:01 --HG-- extra : convert_revision : a04b96d07c54f2dad96694ad3ed79b0e11b26df0 2003-03-18 lars Patch by Adam Li: - man page updated for apphbd - Sample apphbd configure file: doc/apphbd.cf - Updated telecom/apphbd/apphbtest.c to make the output more understandable. New test cases added. However, the new test cases are not enabled by default (yet). - Updated configure.in and a few other Makefile.am files to enable compilation of the apphbd plugin recmgr and the recovery manager. - Bug fixes for recovery manager: - recoverymgrd cannot read more than one recovery script from configuration file. - recoverymgrd cannot set euid and egid for recovery scripts according to config file; order of setuid/setgid corrected. 
- Make sure that the specified user and group exist in the system by using getpwnam() and getgrnam() - Read configuration from file instead of using stdin in recoverymgrd CVS patchset: 1712 CVS date: 2003/03/18 11:36:25 --HG-- extra : convert_revision : 699ceff20fd3081c67d0c3a45b334f5f2dac67bd 2003-03-17 horms apcmastersnmp should compile against net-snmp or ucd-snmp CVS patchset: 1711 CVS date: 2003/03/17 20:34:50 --HG-- extra : convert_revision : 1084711e4e3ffa81408b30417e1535c9eb4cff53 2003-03-12 lars Moving CVS head to 1.1.1 release. CVS patchset: 1694 CVS date: 2003/03/12 18:22:29 --HG-- extra : convert_revision : 9ca56b0ce577e6346f25706621fc223365de8d1a 2003-03-07 alan Added the Delay resource (for testing) to configure.in CVS patchset: 1687 CVS date: 2003/03/07 01:20:16 --HG-- extra : convert_revision : b7486713ab6a8f5e039af7818315633c8c4dbfeb 2003-02-17 alan Changed version number to 1.0.1 CVS patchset: 1669 CVS date: 2003/02/17 21:00:12 --HG-- extra : convert_revision : 62e93212f3541363cae0daaea59fe7bc24464b2f 2003-02-16 horms Apcmaster SNMP module should now compile with ucd-snmp or net-snmp. Martin Bene. CVS patchset: 1663 CVS date: 2003/02/16 10:26:50 --HG-- extra : convert_revision : 5cf05fd61d971ad68dec473df5f33cc4c695706d 2003-02-11 yixiong doc/Makefile.am: changed installation directory to the same as the RPM installation. heartbeat.spec.in: added all the missing files. configure.in: added checks for snmp_subagent for the heartbeat.spec.in file and misc others. CVS patchset: 1640 CVS date: 2003/02/11 21:55:04 --HG-- extra : convert_revision : e8fe8a75664fb8e0aeff4688ffd79e2f570a0d53 2003-02-11 msoffen Fixed problem with not defined variables. CVS patchset: 1639 CVS date: 2003/02/11 17:50:14 --HG-- extra : convert_revision : d2708a4bb0b4fad57f0e44db6d54b8e827a5bde2 Changes to get Heartbeat to compile on FreeBSD at all. Warnings cleaned up too. 
CVS patchset: 1637 CVS date: 2003/02/11 15:16:08 --HG-- extra : convert_revision : 2af04131137dff7a07b8eed851b412fe2e2b6d29 2003-02-11 horms fixed ha_config.h in configure.in CVS patchset: 1635 CVS date: 2003/02/11 01:02:44 --HG-- extra : convert_revision : eb53b9bc1ceff4ee1ceaf5a9082cc196fc2db40b 2003-02-10 horms oops, that is the second time today CVS patchset: 1630 CVS date: 2003/02/10 06:25:32 --HG-- extra : convert_revision : 73e7064d9b150cba0b5e25e1bcc8702b8bd8a50f moved ha_config.h.in back into include/ with the other headers that get installed CVS patchset: 1629 CVS date: 2003/02/10 06:24:34 --HG-- extra : convert_revision : 99299765ab109785d37b50b4c67c30ca7dbdb89f not that one CVS patchset: 1627 CVS date: 2003/02/10 03:57:59 --HG-- extra : convert_revision : f1f7b0255484423734fc12815f1af104fd552046 very minor update for 0.4.9g CVS patchset: 1626 CVS date: 2003/02/10 03:57:08 --HG-- extra : convert_revision : 7d55d536c5a2b519d4eecf5d91e3d156756251a1 2003-02-08 alan Upped the version number. CVS patchset: 1623 CVS date: 2003/02/08 15:29:50 --HG-- extra : convert_revision : f2d2a92bc8b635dc3330c57034bf16fa448f9306 2003-02-06 horms Generate ha_config.h using AM_CONFIG_HEADER such that #ifdef / #ifndef will behave properly in portability.h Previously this was done by somewhat dubious means that relied on #if var / #if ! var working. Sorry about that. N.B: ha_config.h is intended as a minimalist version of config.h that can be installed under /usr/include (or elsewhere) without causing name-space pollution. It is needed so portability.h can be installed under /usr/include (or elsewhere). CVS patchset: 1620 CVS date: 2003/02/06 04:12:05 --HG-- extra : convert_revision : 569a3947bd6157db52ec55992e232a0dab4c2ece 2003-02-05 alan Put in patch from Yixiong Zou to make things compile even if the new SNMP subagent isn't compiled in. 
CVS patchset: 1609 CVS date: 2003/02/05 14:52:58 --HG-- extra : convert_revision : 7a4a548feaf99c8d48f5221b5606a132ed8b3497 2003-02-05 horms Lars put a lot of work into making sure that portability.h is included first, everywhere. However this broke a few things when building against heartbeat headers that have been installed (usually somewhere under /usr/include or /usr/local/include). This patch should resolve this problem without undoing all of Lars's hard work. As an aside: I think that portability.h is a virus that has infected all of heartbeat's code and now must also infect all code that builds against heartbeat. I wish that it didn't need to be included all over the place. Especially in headers to be installed on the system. However, I respect Lars's opinion that this is the best way to resolve some weird build problems in the current tree. CVS patchset: 1604 CVS date: 2003/02/05 09:06:32 --HG-- extra : convert_revision : db5e920dde4d04c27c7609b8649e8241a784ae54 2003-01-31 msoffen Added Makefile so that the config directory gets included in the tarball archive. CVS patchset: 1588 CVS date: 2003/01/31 17:35:05 --HG-- extra : convert_revision : e7e1f75c2fbc8663972139a6cbbc66a0d03e4a9e 2003-01-30 yixiong Resolve the snmp_subagent configure problem. Thanks Matt for giving the solution which is compatible in both autoconf 2.13 and 2.53. Basically, use sinclude() instead of acinclude.m4. So I am backing out the original patches to autoconf-2.53.diff and bootstrap. CVS patchset: 1582 CVS date: 2003/01/30 17:31:20 --HG-- extra : convert_revision : 658a4f1d34244c85b5c0d2d1fbf5bdeac1750dd3 2003-01-24 yixiong Fix for "sigignore() undefined" warning on linux. AC_CHECK_FUNCS(sigignore) is moved after CFLAGS is set. 
CVS patchset: 1577 CVS date: 2003/01/24 18:52:29 --HG-- extra : convert_revision : fc29c4b9349f253a51c1fe3fb7bafd1e124c5b45 2003-01-23 yixiong Changed configure.in so that the "--enable-snmp-subagent" can be used on systems that are running autoconf-2.13 as well. There should be only one AC_OUTPUT() statement in the configure.in file. Otherwise none of the Makefiles will be substituted correctly. CVS patchset: 1576 CVS date: 2003/01/23 22:51:38 --HG-- extra : convert_revision : b95885c4397ec3f926b6ad2a3c549c4588a7adfb A partial fix to make snmp-subagent work on systems that are still running autoconf-2.13. Background: autoconf-2.13 does not support m4_include. Only acinclude.m4 can be used. To read more about this: http://sources.redhat.com/autobook/autobook/autobook_222.html#SEC222 Changes: 1) Modified autoconf-2.53.diff so it will remove the m4_include in configure.in 2) Modified bootstrap to copy config/snmp_subagent.m4 to acinclude.m4 3) Modified configure.in so that the patch can be applied correctly even when it is moved around. Bugs: On systems that are using autoconf 2.13, running "./ConfigureMe configure --enable-snmp-subagent" will pass. But the Makefile will not be substituted correctly. So doing a "make" will fail. I am still working on this bug. But at least this will not prevent people from their normal lives. CVS patchset: 1575 CVS date: 2003/01/23 21:59:59 --HG-- extra : convert_revision : 0ac3d9bd4d6a7c282243a228cd08c91c947196dd 2003-01-23 msoffen Fixed so that enabling snmp works properly. CVS patchset: 1574 CVS date: 2003/01/23 15:32:53 --HG-- extra : convert_revision : 64924e009683c02e57a99cca2cc9179b020cabbc 2003-01-22 yixiong Enable the snmp subagent in the configure script. This should not have a big impact unless you are running the configure script with "--enable-snmp-subagent". Details: 1) Added file config/snmp_subagent.m4 to detect the ucd-snmp/net-snmp library dependencies. 
2) configure.in with --enable-snmp-subagent switch 3) Changed hasubagent.c and Makefile.am so it will compile with both ucd-snmp and net-snmp. CVS patchset: 1573 CVS date: 2003/01/22 19:49:23 --HG-- extra : convert_revision : 05412a868f4011952cac817bd37edbef5c991a83 2003-01-17 msoffen Changed so that Werror isn't default on Solaris (several signal based problems). Cleaned some of the replacement code. CVS patchset: 1572 CVS date: 2003/01/17 08:42:55 --HG-- extra : convert_revision : f8d5c1cec8b4e52a34a88c5cad63341b31136ebf 2003-01-14 msoffen Added AC_CHECK_SIZEOF(long long). CVS patchset: 1561 CVS date: 2003/01/14 11:11:19 --HG-- extra : convert_revision : 031bae02b8750034d8d69fab2095a8f87987fad0 2003-01-13 yixiong Changed ConfigureMe so that you can pass "configure" options through. For example: ./ConfigureMe configure --enable-snmp-subagent Changed configure.in to enable the status for snmp subagent. CVS patchset: 1558 CVS date: 2003/01/13 19:05:10 --HG-- extra : convert_revision : 823c3905c4b3d7b13bdb53c3fc76c4560361b31a 2003-01-08 msoffen Changes to allow the --enable/--disable features to work properly. Defaulted --enable-fatal-error to on (Works properly with FreeBSD/Solaris). CVS patchset: 1556 CVS date: 2003/01/08 21:22:16 --HG-- extra : convert_revision : 03aa581ff563572fa831b1ffdb85b328287a9054 2002-12-18 horms AC_ARG_VAR doesn't seem to work with older versions of autoconf, and it is only advised in newer versions. So leave it out for now CVS patchset: 1546 CVS date: 2002/12/18 02:35:43 --HG-- extra : convert_revision : 92828e711179fea1cdca386099571d0b7f5dd806 2002-12-13 horms snmp subagent should now compile with net-snmp CVS patchset: 1538 CVS date: 2002/12/13 00:04:22 --HG-- extra : convert_revision : d1a05f69324c0d3d0f6fb65fea4658493d3a70b3 2002-12-12 horms Tidy up the output of configure --help CVS patchset: 1537 CVS date: 2002/12/12 22:56:05 --HG-- extra : convert_revision : e401a80d1ce6c8ab978f51f22a40317a26f85fbf configure cleanup. 
Dominik Vogt CVS patchset: 1535 CVS date: 2002/12/12 14:23:55 --HG-- extra : convert_revision : a5b7a7c133bbbfe026ffc720104168180b99cc1e 2002-12-03 horms fixed snmp_agent build problem. Yixiong Zou CVS patchset: 1530 CVS date: 2002/12/03 01:08:11 --HG-- extra : convert_revision : 747d3fa8000d8516ee592ae7e7f175484931e4ec 2002-11-28 alan We had a problem with local status updates getting all hosed sometimes (depending on timing). This greatly simplifies the management of local status, and even takes a field out of the heartbeat packet. A fix like this was suggested by Horms. CVS patchset: 1521 CVS date: 2002/11/28 17:10:05 --HG-- extra : convert_revision : 0e89cc594409f0c2139f27338b10bdb84d1026a5 2002-11-21 alan Added a dummy version of the SNMP code from Yixiong Zou. CVS patchset: 1509 CVS date: 2002/11/21 05:14:43 --HG-- extra : convert_revision : 3c6b8ae2e6d1d3ca8c779e2d22eb9696b5925daf 2002-11-05 msoffen Added check for termio before including termio.h CVS patchset: 1498 CVS date: 2002/11/05 19:32:54 --HG-- extra : convert_revision : d5570982e15e412f102d061fa031280e5321767c 2002-11-04 alan Added IPsrcaddr resource script to CVS. Coincidentally, picked up code to look for the strerror function. CVS patchset: 1491 CVS date: 2002/11/04 17:27:32 --HG-- extra : convert_revision : 23b12827908f4453339b8c63a0ba01ece9ced40d 2002-10-28 alan Fixing the nfds_t problem that Matt introduced, also getting rid of the brk() call test (it was all wrongheaded). CVS patchset: 1469 CVS date: 2002/10/28 17:15:00 --HG-- extra : convert_revision : 4f858a54d0919fb0c4b6d86a83704c23c3f17afd 2002-10-25 msoffen Created SYS_INCLUDE for the "system" specific includes - /usr/local/include. Mostly used by Solaris and FreeBSD. CVS patchset: 1466 CVS date: 2002/10/25 20:04:22 --HG-- extra : convert_revision : ed641749e44cb449691b81ad1e5e956282ce048e 2002-10-23 horms Provide configure options to turn on and off fatal and traditional warnings. I think these used to be in here. 
CVS patchset: 1461 CVS date: 2002/10/23 07:26:53 --HG-- extra : convert_revision : 9668a18acbd7e087de9c03dd7c8678785a047ee5 2002-10-21 horms hb api clients may now be built outside of the heartbeat tree CVS patchset: 1440 CVS date: 2002/10/21 10:17:17 --HG-- extra : convert_revision : b9f41c9de9119ba4a759395ce9e7efa8e1554928 2002-10-18 alan Updated the version to 0.4.9f CVS patchset: 1433 CVS date: 2002/10/18 12:33:24 --HG-- extra : convert_revision : 1e9a701ab84975eff27c7a48d0ec67d060e4fb25 2002-10-11 alan Got rid of some extra lines in configure.in that have probably been causing various kinds of warnings. CVS patchset: 1416 CVS date: 2002/10/11 00:57:12 --HG-- extra : convert_revision : 261fda22de4b6c928421908f57bfdaf7ef30d7bd 2002-10-10 msoffen Minor fix for LIBRT . CVS patchset: 1414 CVS date: 2002/10/10 14:39:38 --HG-- extra : convert_revision : 97ee117220b58a768149af53b4d68b519802953f 2002-10-10 horms start stop has moved from heartbeat/resource.d/ to doc/ CVS patchset: 1412 CVS date: 2002/10/10 00:50:25 --HG-- extra : convert_revision : a83657a6baf4e06a368add42208e23e0772161a3 2002-10-08 alan An attempt to fix Matt's problem which appears to be the result of dropping privileges incorrectly. CVS patchset: 1405 CVS date: 2002/10/08 03:40:37 --HG-- extra : convert_revision : 6596547f06b4168f912c3352b55c843de10d3693 2002-10-07 alan Put in changes to mark the diff-ed section of the configure.in file be marked clearly so people don't mess with it. CVS patchset: 1401 CVS date: 2002/10/07 21:42:42 --HG-- extra : convert_revision : 626126b414481f9ad42eed40bf20e863ce983c07 2002-10-01 alan Moved oc_event.h to the OCF directory and added oc_membership.h CVS patchset: 1383 CVS date: 2002/10/01 20:03:09 --HG-- extra : convert_revision : cfeb1814d54bc2429267bdf60a6d99c204f97d54 2002-09-30 msoffen I had LIBRT check backasswards. 
CVS patchset: 1381 CVS date: 2002/09/30 17:53:12 --HG-- extra : convert_revision : 8d070d43a35f127a5bd7fac3d688f98724f897bf Created intermediate LIBRT to link LIBRT with (since there IS no librt.so in freebsd). It checks if HAVE_LIBRT is set; if so, it sets LIBRT to -lrt, else it uses nothing. CVS patchset: 1378 CVS date: 2002/09/30 07:07:24 --HG-- extra : convert_revision : a0a53919c43dda14f752e79ac49490a66e9db43d 2002-09-26 horms This seems to work better on my system. It also more closely matches the messages displayed by other checks CVS patchset: 1370 CVS date: 2002/09/26 03:50:20 --HG-- extra : convert_revision : 5766cafeb684926f5e6e938fa7384c6db6c28d72 2002-09-26 alan Made the 2.53 patch work again. Added code for preallocating a little memory before locking ourselves into memory. Added a new poll routine to be used in place of the system poll routine (for the cases we can do this). Changed apphbd to use it, and added code to allow heartbeat to use it (requires 2.5 kernel). Minor realtime fixes: use write instead of fwrite Don't open FIFO to client for each msg to them. Fixed a bug in apphbd where it would complain and loop when clients disconnected unexpectedly. Added apphbd change to allow clients to specify warn times. Changed "client disconnected without telling us" from an error to a warning CVS patchset: 1369 CVS date: 2002/09/26 03:28:46 --HG-- extra : convert_revision : f57a7bb2c60c0c0cbe121e6ee86793635dbf0955 2002-09-24 msoffen Fixed to remove warnings on FreeBSD ( Also eliminated need for acconfig.h ). CVS patchset: 1365 CVS date: 2002/09/24 17:11:17 --HG-- extra : convert_revision : 0cace9a238123c5a6c7904f099795dc9c3a449ed 2002-09-17 alan Put in dependencies on ping in configure file and heartbeat spec file. 
CVS patchset: 1353 CVS date: 2002/09/17 15:21:38 --HG-- extra : convert_revision : 13750f0e31b9dbef6d0da1ecb6c7c7b36c3fc253 2002-09-13 msoffen Modified for Solaris by including Strings.h (Do NOT put into portability.h) because on linux strings.h should not be used WITH string.h . CVS patchset: 1342 CVS date: 2002/09/13 14:37:26 --HG-- extra : convert_revision : 0416ddd2f307fa1bdaa535198240dc4c80c9a5d7 2002-09-11 alan Put in a patch to make configure work correctly on slackware. CVS patchset: 1325 CVS date: 2002/09/11 17:43:41 --HG-- extra : convert_revision : ff5806d704671ae07438a903bb76e02cb53f247b 2002-09-10 alan Put in a test to see if net/ethernet.h is present on the build system. Also used this check in the get_hw_addr code. CVS patchset: 1316 CVS date: 2002/09/10 21:59:01 --HG-- extra : convert_revision : a0d660d5b77b2e3289d99d28cc3f5777c6269ece Added code, modified code to move to a common set of logging functions - cl_log() and friends. CVS patchset: 1315 CVS date: 2002/09/10 21:50:06 --HG-- extra : convert_revision : d3fa49036cb3a17f638df03e7162b8b1eaa0f662 2002-09-10 msoffen Adding the Alphasort function for future use. We may not need it but it shouldn't hurt anything having it defined. CVS patchset: 1308 CVS date: 2002/09/10 16:08:10 --HG-- extra : convert_revision : 9922428a79f141457ce3e9f5a614002c06bbb199 2002-09-10 alan Put in some build patches from Kevin Dwyer for OpenBSD. It isn't everything he gave me, but one of the fixes breaks Linux. CVS patchset: 1302 CVS date: 2002/09/10 03:31:52 --HG-- extra : convert_revision : 966d6cf3e36a70f10ac86389acff2971fd2652e6 2002-09-09 alan Put in a change which makes the heartbeat Makefile work correctly with some particular toolset versions for the old libnet for send_arp and get_hw_addr(). 
CVS patchset: 1300 CVS date: 2002/09/09 21:00:28 --HG-- extra : convert_revision : 9f07122648d070dffa48957ee62470c817cd9022 Put in some code to automatically apply the autoconf patch for version 2.53 when needed - or unapply it as appropriate. Also applied a patch from Nathan Wallwork for another autoconf/make version compatibility problem. CVS patchset: 1296 CVS date: 2002/09/09 16:44:57 --HG-- extra : convert_revision : 973521615d307ef4416d29e68470314316ceb73a Added Kevin Dwyer's ipfail program to the make process. CVS patchset: 1295 CVS date: 2002/09/09 13:10:28 --HG-- extra : convert_revision : 71dca512392be9d97ac5f02ea397f2e1d2e7e367 2002-09-05 alan Changed the autoconf stuff to deal with the fact that the new libnet doesn't have libnet-config. CVS patchset: 1287 CVS date: 2002/09/05 07:04:50 --HG-- extra : convert_revision : 97a61a75af0e4f2bce375f8d2c5d905f81f9374c Put code into configure.in to detect which version of the libnet API is present in the installed libnet library. CVS patchset: 1283 CVS date: 2002/09/05 04:15:08 --HG-- extra : convert_revision : 984924c26ee870c9c64b50d7c3b32ea8d545cc2f 2002-08-20 alan Changed the STONITH code to be based on the new plugin system. CVS patchset: 1258 CVS date: 2002/08/20 16:12:46 --HG-- extra : convert_revision : 386a9f2110fbdd94dd8442808db63045a11836e9 2002-08-20 msoffen Added -A opt for IFCONFIG_A_OPT for OpenBSD (-A shows Alias addresses, -a doesn't) CVS patchset: 1255 CVS date: 2002/08/20 15:46:17 --HG-- extra : convert_revision : bb1bc1131b1e88d29b2f6b6bb9a333c6de7c79ba 2002-08-18 alan Made PILS configuration more flexible. Changed to a PATH for the plugin directories. Moved the default PILS plugins to /usr/lib/pils/plugins. Included that directory by default. CVS patchset: 1248 CVS date: 2002/08/18 01:35:54 --HG-- extra : convert_revision : e71c7fdaf33db5fa27b486bc17ad83dfa7490872 2002-08-17 alan Added a script called BasicSanityCheck to check if a compiled version has basic sanity... 
CVS patchset: 1245 CVS date: 2002/08/17 06:39:32 --HG-- extra : convert_revision : 0b4493b3c832d8d6f27983fec73e39a14bb76753 2002-08-12 alan Substitute a grep script for getent when getent isn't available. CVS patchset: 1224 CVS date: 2002/08/12 14:37:14 --HG-- extra : convert_revision : d68a8c340b1cc05db7f214955f7dbd4e25555025 2002-08-10 alan Put in a ksh fix for the configure.in Make PILS into a subpackage. Install header files in different directories for PILS and STONITH. More IGNORESIG fixes. CVS patchset: 1222 CVS date: 2002/08/10 02:26:12 --HG-- extra : convert_revision : b66c85df940454d59214141309dce451b5eb5f85 2002-08-07 msoffen Cleaned up many warning messages from FreeBSD and Solaris. Cleaned up signal calls with SIG_IGN for Solaris to use sigignore function (to remove some warnings). CVS patchset: 1209 CVS date: 2002/08/07 18:20:32 --HG-- extra : convert_revision : de751cca0f738093022e2e26707ddfded526f4c2 2002-08-01 lars Courtesy of Ram Pai: - Compile fixes to ccm - configure.in and specfile scripts converted to use getent - Fix to ping.c by lmb (after he broke it) CVS patchset: 1195 CVS date: 2002/08/01 20:08:03 --HG-- extra : convert_revision : cedf0fbfd02c5dab6ae9585362d9053dd571f2c1 2002-07-31 alan Put in bug fix for bug discovered by Lmb and fixed by Ram. The APIGID was set instead of the CCMUID. CVS patchset: 1191 CVS date: 2002/07/31 15:42:58 --HG-- extra : convert_revision : 46b5aca822b7887c1469cdafac073543259d421b Changed the release number to '0.4.9d' in preparation for the next beta. CVS patchset: 1188 CVS date: 2002/07/31 12:29:00 --HG-- extra : convert_revision : 8c95cc550209c55f6f40dde3ac14c214d8d212bd Put in code to parameterize the name of the ucd-snmp development package and default it to one thing on SuSE, and something else for all other distros... 
CVS patchset: 1184 CVS date: 2002/07/31 02:23:27 --HG-- extra : convert_revision : bd9aaa9cd2e5a8564dcb60f08d2be69bc3b141f6 Put in Ram's changes which make his CCM code compile and install correctly. CVS patchset: 1181 CVS date: 2002/07/31 01:22:11 --HG-- extra : convert_revision : 58acdfb1196308b5b0cf2fa417cc5a4975b92d6c 2002-07-30 alan Added Ram's CCM membership code. It's untested as of now... CVS patchset: 1177 CVS date: 2002/07/30 15:54:07 --HG-- extra : convert_revision : 6f02c1106a697f7a81bae9f9c1549110a507e4d7 Updated the version to 0.4.9c CVS patchset: 1173 CVS date: 2002/07/30 11:49:09 --HG-- extra : convert_revision : 39ae63e8a5c5c39b8045a3e64b5b2d81cef31a90 2002-07-17 horms My bad. Apparently "export -n" isn't as universal as I expected. In particular it fails on Solaris (I _love_ Solaris). The behaviour under Solaris' /bin/sh is that variables are not automatically reexported. Thus we test to see if CFLAGS is being exported and only run "export -n" in that case. This also avoids any warnings that "export -n" may throw if CFLAGS was not exported. Additionally, I have added a || true, so that even if "export -n" is called it will not cause configure to bomb out. This is the same behaviour as if this code was not there at all. CVS patchset: 1163 CVS date: 2002/07/17 08:19:22 --HG-- extra : convert_revision : 72265415cefdfa43e911c055da795431dfc00156 2002-07-16 msoffen Added export CFLAGS before export -n CFLAGS (to make it not warn) CVS patchset: 1158 CVS date: 2002/07/16 17:02:38 --HG-- extra : convert_revision : f03f20cefcbc15a95285841455ffffe34fc93296 2002-07-16 horms I have finally worked out why the -Werror additions to CFLAGS were causing me such problems. This patch explicitly sets CFLAGS as unexported. This should not break anything, and keeps Alan's Policy of having -Werror on by default for Linux. 
CVS patchset: 1151 CVS date: 2002/07/16 12:31:27 --HG-- extra : convert_revision : d3c16bb0d82e8fc0db7888ff139252c52060b778 2002-07-16 alan Added a little bit to hopefully make OpenBSD work with the current build architecture. CVS patchset: 1146 CVS date: 2002/07/16 02:35:17 --HG-- extra : convert_revision : 038de97ae7b09c2bcab1b003749a76a02b552aa7 2002-07-15 msoffen Changed LIBOPTS line. 'Fix' breaks on every single version other than the newest Suse (United Linux) CVS patchset: 1145 CVS date: 2002/07/15 14:40:06 --HG-- extra : convert_revision : 00f41d005e4192688e4a84fdd6c48644f3a8b9ce 2002-07-10 lars Fix gcc deprecated warnings. -Wshadow had to be dropped from the CFLAGS, because the newer gcc defines "index" as a builtin, and gcc complains that the glib.h shadows this in numerous places because the shadow detection in gcc is rather simplistic. CVS patchset: 1142 CVS date: 2002/07/10 16:03:33 --HG-- extra : convert_revision : bad7debaa5a032eddcf97e5e25997b760bf23c1e Make autoconf happy again as per http://www.gnu.org/manual/autoconf/html_mono/autoconf.html#AC_LIBOBJ%20vs.%20LIBOBJS: The token LIBOBJS may not be referenced directly any longer, but must be quoted. CVS patchset: 1141 CVS date: 2002/07/10 15:17:22 --HG-- extra : convert_revision : f1de1f5dd66d3248b3917572cc6ce5dc9bbf8a74 2002-07-05 msoffen Changed to "remove" ipcsocket.c from building on Solaris. CVS patchset: 1137 CVS date: 2002/07/05 14:41:01 --HG-- extra : convert_revision : 949f8da0c052047e7a72fe1eca2a0b4387cb0f0c 2002-06-19 msoffen Changes to get ipcsocket.c to compile on BSD. CVS patchset: 1130 CVS date: 2002/06/19 04:44:52 --HG-- extra : convert_revision : d6c57f5a7371e5d0dacc4b746bf0a65f520ea093 2002-06-06 alan Fixed some problems running with newer versions of libtool and automake Added first draft of code for apphbd daemon Added beginnings of application heartbeating for heartbeat clients. 
CVS patchset: 1096 CVS date: 2002/06/06 06:10:03 --HG-- extra : convert_revision : 435cd602991e4e8a3ba32e2de85fda5a294a3167 2002-05-28 msoffen Changes to replace send_arp with a libnet based version. This works across all operating systems we currently "support" (Linux, FreeBSD, Solaris). CVS patchset: 1078 CVS date: 2002/05/28 18:25:48 --HG-- extra : convert_revision : bf8a7c9a72f0d6e9eb99a78117a56c158d4119cf 2002-05-16 alan Added cts to the substitutable file list in configure.in CVS patchset: 1070 CVS date: 2002/05/16 13:29:28 --HG-- extra : convert_revision : 0eb5d47e4b1aad7cb24269878966e9ec7ab56ed8 2002-05-14 msoffen 1) Adding PING variable (for differences with FreeBSD and Linux). 2) Added cts/*.py so that they get created from the .in files. CVS patchset: 1064 CVS date: 2002/05/13 23:11:53 --HG-- extra : convert_revision : ff3529d968f879d1421fa5083fed63fd67330ad9 2002-04-19 alan Put in changes from Matt Soffen. CVS patchset: 1045 CVS date: 2002/04/19 21:49:32 --HG-- extra : convert_revision : eb781df760df300a15fd4a68446e6a378024d8b4 2002-04-16 lars Added auto-detection location of modprobe, fuser, mount, umount, raidstart, raidstop at compile time. CVS patchset: 1040 CVS date: 2002/04/16 13:39:12 --HG-- extra : convert_revision : e7a8d023082dec38a44f832340e09a3f6a47bea3 2002-04-15 lars - Added RA for ICP Vortex Cluster controllers. - Modified Filesystem to recognize XFS as a journaled filesystem. CVS patchset: 1036 CVS date: 2002/04/15 16:01:25 --HG-- extra : convert_revision : 63d0250d5268705ce7dd44899a886e9f02326384 2002-04-04 alan Put in a whole bunch of new code to manage processes much more generally, and powerfully. It fixes two important bugs: STONITH wasn't waited on before we took over resources. And, we didn't stop our takeover processes before we started to shut down. 
CVS patchset: 1008 CVS date: 2002/04/04 17:55:27 --HG-- extra : convert_revision : 7c6e52a2da086e3ea426367ec3bf8c3bee92fa70 2002-04-03 alan Made the init starting and stopping priorities into autoconf variables. They default to 75 and 5 respectively. They should probably be overridden for SuSE in the ConfigureMe script. CVS patchset: 1005 CVS date: 2002/04/03 20:02:21 --HG-- extra : convert_revision : 214058f3fd225afd80c2cc82197f0e28f59c41f8 2002-04-02 alan Failover was completely broken because of a typo in the configure.in file. Changed the run level priorities so that heartbeat starts after drbd by default. Changed it so that heartbeat by default runs in init level 5 too... Fixed a problem which happened when both nodes started about simultaneously. The result was that hb_standby wouldn't work afterwards. Raised the debug level of some reasonably verbose messages so that you can turn on debug 1 and not be flooded with log messages. Changed the code so that in the case of nice_failback there is no waiting for the other side to give up resources, because we negotiate this in advance. It gets this information through an environment variable. CVS patchset: 1004 CVS date: 2002/04/02 19:40:36 --HG-- extra : convert_revision : 9e592bb29d3804989f12e43c8199231ffbf41080 2002-03-22 alan Added an apache resource script. It also supports the IBMHTTPserver, if you link it to IBMhttp or something like that. CVS patchset: 995 CVS date: 2002/03/22 05:50:52 --HG-- extra : convert_revision : 4c5641a9d3477859abfad6a645adea8af2b0e3fa 2002-03-15 alan Added an LVM resource script. CVS patchset: 990 CVS date: 2002/03/15 15:39:28 --HG-- extra : convert_revision : 2fd91d910c0149fef908906c2cc674164945e9a7 2002-03-11 alan Added the WAS resource script to the configure.in file and the resource.d Makefile.am. CVS patchset: 978 CVS date: 2002/03/11 22:25:24 --HG-- extra : convert_revision : 6f3dcd71425e84b23a7c6738d334c741a7b919fc 2002-03-09 alan Added several new resource scripts. 
CVS patchset: 975 CVS date: 2002/03/09 13:47:22 --HG-- extra : convert_revision : e4db46f81438c848464f707ffcc43a9e4fac360a 2002-03-05 alan Put in a few "" strings around arguments to test. CVS patchset: 969 CVS date: 2002/03/05 21:19:27 --HG-- extra : convert_revision : 82a676df268c13a0d369ee80ca9d3c7721aabdc8 2002-02-12 alan Put in code to filter out rc script execution on every possible message, so that only those scripts that actually exist will we attempt to execute. CVS patchset: 957 CVS date: 2002/02/12 15:22:28 --HG-- extra : convert_revision : 9ebb26cd586f576b5e0509e46f96e903e86f68a2 2002-02-11 alan Changed the configure.in so that if the API group already exists, then it will use whatever the API gid is, and not use APIGID. CVS patchset: 952 CVS date: 2002/02/11 03:25:08 --HG-- extra : convert_revision : ae574ce6e30e61f58c7b5f34bbf6c50ed5e5dade Added a little initial code to support starting client programs when we start, and shutting them down when we stop. CVS patchset: 951 CVS date: 2002/02/10 23:09:25 --HG-- extra : convert_revision : bce3efba29b261c2c3cf2832117611d8303ca648 2002-01-22 alan Put in change to accommodate Solaris /var/spool/locks directory. Suggested by Matt Soffen CVS patchset: 939 CVS date: 2002/01/22 12:19:38 --HG-- extra : convert_revision : da35f6fc5ba0d32f7df9feb97abf135b0380edc8 2002-01-17 alan Get rid of a warning... CVS patchset: 931 CVS date: 2002/01/17 15:13:16 --HG-- extra : convert_revision : 1b1815ca95a72c5ac1be5e9b42813cf6704edb26 2001-12-18 alan Moved the AM_PROG_LIBTOOL up earlier in the configure.in file in hopes of solving a problem someone is having with a different (later?) version of libtool and/or automake. CVS patchset: 921 CVS date: 2001/12/18 20:07:23 --HG-- extra : convert_revision : a783dcbcd9029bb164e070e926c93b6f79cb56c4 2001-12-12 alan Had to eliminate the -Wnested-externs flags because of stupid libtool changes. 
CVS patchset: 916 CVS date: 2001/12/12 11:12:45 --HG-- extra : convert_revision : fe5bd2a25050b01e149a6b93cc2fc1a481ad3c23 2001-11-26 horms Oops, this should not have been committed. (Horms) CVS patchset: 915 CVS date: 2001/11/26 15:21:32 --HG-- extra : convert_revision : 15579e0c681906605e444dc0ba7313cbbaa9c246 Added BuildPrereq: glib-devel. (Horms) CVS patchset: 914 CVS date: 2001/11/26 15:19:50 --HG-- extra : convert_revision : 721f91585da86d9d6c172d3c4f60e35ac21f9f90 2001-11-23 alan Updated configure.in to work with newer versions of automake/autoconf. CVS patchset: 910 CVS date: 2001/11/23 21:39:14 --HG-- extra : convert_revision : b34f62c08e05afddc2589b50a64002fe59fbb603 2001-10-25 alan Changed the code to auto-configure the location to install System V style RC scripts into -- which is also where we'll run init scripts from, if they're there... CVS patchset: 883 CVS date: 2001/10/25 14:21:40 --HG-- extra : convert_revision : 5fb3e799af1aba93a001a95bfe2fe61b4a9ffff7 2001-10-24 alan A large number of patches. They are in these categories: Fixes from Matt Soffen Fixes to test environment things - including changing some ERRORs to WARNings and vice versa. etc. CVS patchset: 880 CVS date: 2001/10/24 20:46:28 --HG-- extra : convert_revision : aa5a8b642da56d75d7817244cdb042cb413c5f34 2001-10-13 alan Put in a fix to the findif command which makes it test which way to get routing information at run time rather than at compile time. This permits configure to be run as a normal user on environments where /proc/route isn't available. CVS patchset: 863 CVS date: 2001/10/13 21:03:12 --HG-- extra : convert_revision : d6840fde1c7de95324801c33631628e6bfd59aa8 Added Luis' patch for providing the standby capability CVS patchset: 858 CVS date: 2001/10/12 22:38:06 --HG-- extra : convert_revision : 1afe5e70d84793ed9c9e2fa9db24180ab9ecf5f5 2001-10-12 alan Put in a patch to fix the configuration of tty lock directories -- both in the source and in the configure.in file. 
CVS patchset: 856 CVS date: 2001/10/12 19:29:38 --HG-- extra : convert_revision : 8d7a3ea94c73dcee5044b7c7fa55b77ee84d127f Patch from David Lee for making the location of libintl work on Solaris. CVS patchset: 853 CVS date: 2001/10/12 15:37:47 --HG-- extra : convert_revision : 39a0202c421bbb89048a06bcd208b77485fcbc5a 2001-10-10 alan Corrected syntax error in configure.in CVS patchset: 850 CVS date: 2001/10/10 15:03:03 --HG-- extra : convert_revision : 0cf7078687afa39a0f4eaeb0fa9cf70aa3bec420 Added a comment about the previous change. CVS patchset: 849 CVS date: 2001/10/10 15:01:11 --HG-- extra : convert_revision : 6261cdea7c4dad70c1c6e26882b49c9e8dd2ccfe Added a check to test for /usr/local/lib for the -lintl library (for FreeBSD) CVS patchset: 848 CVS date: 2001/10/10 15:00:19 --HG-- extra : convert_revision : 9415a7f109b79366c40551959190f2535f6aec2d 2001-10-09 alan Incorporated a fix from Matt Soffen on locating dlopen. CVS patchset: 845 CVS date: 2001/10/09 13:38:30 --HG-- extra : convert_revision : c0a98223cb90d9ddbbff2b199865406b43fa0457 2001-10-07 alan Added some more portability patches from David Lee. CVS patchset: 837 CVS date: 2001/10/07 04:18:20 --HG-- extra : convert_revision : 59d4633286bca00874bd693f9dfaba39d86eb3ad Fixed up the 'echo' code in rc script so that it's portable. The 'shellfuncs' function library now have Echo EchoEsc and EchoNoNl functions in it. They are supposed to work on any OS. CVS patchset: 835 CVS date: 2001/10/07 03:58:10 --HG-- extra : convert_revision : 8e5cb256ebe62d4aeb67fa22f1f0ada07566c8e2 2001-10-06 alan Straightened out how the libintl code is selected and linked. Did similar things for the libdl code. CVS patchset: 829 CVS date: 2001/10/05 22:12:45 --HG-- extra : convert_revision : c7fa97222b1053dfc0b12483a8b54c222c7acbe8 2001-10-04 alan Patch from Reza Arbab to make it compile correctly on AIX. 
CVS patchset: 827 CVS date: 2001/10/04 21:14:30 --HG-- extra : convert_revision : 4c145c4f658ac9df4bd566a00cf2d86118865217 2001-10-03 alan Added a couple of patches from Matt Soffen: Make a debug statement conditional ;-) Fix configure so it does things correctly on FreeBSD and Solaris findif.c configuration parameters. CVS patchset: 811 CVS date: 2001/10/03 05:45:56 --HG-- extra : convert_revision : 425a692103a864ed0a1c7d1490923a1fb4576a40 2001-10-02 alan Changed configure.in so that it autodetects vacm and snmp capabilities. Must have been in the wrong directory when I did the last commit... Sigh... CVS patchset: 801 CVS date: 2001/10/02 14:07:11 --HG-- extra : convert_revision : 2a0ab18c4dc44b2edda5e496a4808c1e8698d556 Added a check for stdarg.h -- this is for a header file we include. CVS patchset: 796 CVS date: 2001/10/02 05:09:08 --HG-- extra : convert_revision : f5487e7b7e9392ac4f8878e81265ea4c1a63a710 2001-09-28 alan Hopefully fixed the PATH stuff so it's right... I appear to have fixed the variable substitution issues in the specfile. CVS patchset: 782 CVS date: 2001/09/28 20:13:21 --HG-- extra : convert_revision : 5e4b01e0d8bbd9be5d653cb710b84f074d48588c 2001-09-27 alan Shortened alarm time in write in serial.c Put in a handful of Solaris warning-elimination patches. CVS patchset: 781 CVS date: 2001/09/27 17:02:34 --HG-- extra : convert_revision : 6bca5ff5feca2d11e68d4a31777c3fbb21fc26d9 Made change suggested by David Lee on SYSPATH setting... CVS patchset: 779 CVS date: 2001/09/26 22:22:47 --HG-- extra : convert_revision : 97abc3c7b522bc82b24070e9e266dbf2ca29831a 2001-09-22 alan Added more things to make the RPMs just flow nicely... 
CVS patchset: 776 CVS date: 2001/09/22 19:47:04 --HG-- extra : convert_revision : ee26c7f993cab9e0644ac526677166b2046974c6 Changed the way RPMs are built so that whatever options you specify in configure will propagate to the RPM, and the RPM will do the same configure you did in the first place rather than hard-coding paths to certain places where Red Hat and SuSE like to put them. Next I need to actually add a "make rpm" target... CVS patchset: 775 CVS date: 2001/09/22 08:19:28 --HG-- extra : convert_revision : 5ac7d548731a5678e7401639960636ba48449a47 2001-09-07 horms These new checks need to be before we play with CFLAGS. Horms CVS patchset: 770 CVS date: 2001/09/07 06:30:16 --HG-- extra : convert_revision : 999e01cd0fb83d6e8fc13f9d0c616fdb236fa104 2001-09-06 horms Added code to set proctitle for heartbeat processes. Working on why heartbeat doesn't restart itself properly. I'd send the latter as a patch to the list but it is rather intertwined with the former. CVS patchset: 762 CVS date: 2001/09/06 16:14:35 --HG-- extra : convert_revision : 151bcb2f380d89c3f2383671b2139d167bab9bd3 2001-08-21 horms More OpenBSD stuff from Matt Soffen CVS patchset: 748 CVS date: 2001/08/21 03:59:11 --HG-- extra : convert_revision : cb5d6ceb470e974b2372417848a281c9d75825e8 * Modified stonith/Makefile.am so that it doesn't refer to expect.h or stonith.h which have both been moved to include/stonith/ * Removed stonith/expect.h as it has been moved to include/stonith/ * Added stonith/Makefile.am so expect.h and stonith.h will actually appear in tarballs. 
* Modified configure.in and include/Makefile.am to reflect addition of stonith/Makefile.am CVS patchset: 747 CVS date: 2001/08/21 02:08:49 --HG-- extra : convert_revision : 2aa282e8e005bc8281e41e4162e59fb23759c511 2001-08-16 horms Display results of checks for CLK_TCK CVS patchset: 743 CVS date: 2001/08/16 12:04:33 --HG-- extra : convert_revision : 3874f5b43756d38ef323e0436f992f77d7cadaf0 2001-08-15 alan Put in patches by Sandro Poppi to enhance MailTo and add the WinPopup resource scripts. CVS patchset: 737 CVS date: 2001/08/15 04:56:12 --HG-- extra : convert_revision : da616dc6a5e2ae96e72e9cebf4e9a02aae13d6f1 Improved the code for installing the FIFOs so they have the right group affiliation. CVS patchset: 736 CVS date: 2001/08/15 04:37:02 --HG-- extra : convert_revision : 0e1147d161832b261017b9ad9e46938fb5333a3d Patch from Ragnar Kjørstad: * Moves ROUTE, NETSTAT and IFCONFIG to config.h * Uses AC_PROG_PATH macro rather than locatecmd() function. Was there any reason why locatecmd was introduced in the first place? The cvs log doesn't state any reason... * Makes USE_ROUTE_GET behave correctly. The old code _always_ defined this macro, so when it's tested with #ifdef USE_ROUTE_GET in heartbeat/findif.c it always succeeds - thus making heartbeat use "route -get" instead of /proc * Adds a few missing files to heartbeat.spec * Disables RPM_OPT_FLAGS from being passed to configure because -Wall -Werror makes autoconf fail. (It fails because autoconf uses "main() {return 0;}" to test if you have a working C compiler. It should have used "int main(void) {return 0;}".) We really ought to use RPM_OPT_FLAGS, but I don't see how we can make that work properly without fixing autoconf and making everybody upgrade. * Add rc.d/stonith to the Makefile. 
CVS patchset: 735 CVS date: 2001/08/15 02:46:37 --HG-- extra : convert_revision : 22b6215da2a6fba12388f451ba65e966617a9868 2001-08-14 horms make distcheck should work now, -Werror problems notwithstanding CVS patchset: 724 CVS date: 2001/08/14 03:25:07 --HG-- extra : convert_revision : c4f07f6fe9eb3568f14c98f8331ea7feb5a8bfdc 2001-08-13 horms Incorporated Matthew Soffen's patch to allow glib checks to work on FreeBSD CVS patchset: 721 CVS date: 2001/08/13 10:06:14 --HG-- extra : convert_revision : 5f491d96df3212024aa7b0c520ddb479c89714e3 2001-08-12 alan Finished the patch from Lorn Kay. Also reverted a patch from Horms which turned off strict warnings on Linux. CVS patchset: 719 CVS date: 2001/08/11 22:59:25 --HG-- extra : convert_revision : d1419caa92ef4225b7ac6ffbfa713983b90ae120 2001-08-11 alan Converted comm software over to use new plugin loading system and also moved header files around to be a little more "standard". CVS patchset: 716 CVS date: 2001/08/11 00:50:43 --HG-- extra : convert_revision : 32c40af039d81afb6687acde97a760ee727c882d 2001-08-10 horms Don't set -Werror by default on Linux as on Debian Woody at least ./configure in libltdl/ will fail if this is set. For this reason a much more sane default is to leave it off and ask for it at configure time if you really want it. CVS patchset: 705 CVS date: 2001/08/10 04:21:08 --HG-- extra : convert_revision : 6fa3c688ce6a11215d9caaca70e96277ed8cc855 2001-07-20 alan Made a change I'm not very happy with... I changed the name of the package from linux-ha to heartbeat because some directory names were being created as linux-ha which is wrong... CVS patchset: 691 CVS date: 2001/07/20 05:39:02 --HG-- extra : convert_revision : bb7ae01c7945f935043467ebd2c6a4287a409c89 2001-07-19 alan Replaced a [ ... ] with the test command... 
CVS patchset: 681 CVS date: 2001/07/19 05:59:31 --HG-- extra : convert_revision : 9effe29cb7ee56bb3df6736b794040ac9823114f 2001-07-17 alan Slightly modified the Makefile.am and configure.in so that the necessary subdirectories get makefiles generated for them and get built correctly. CVS patchset: 673 CVS date: 2001/07/17 15:40:13 --HG-- extra : convert_revision : 2689ef51109ed390e07b3752d9e9cd05f8be138d Put in some changes to allow us to detect missing glib header files, etc. CVS patchset: 672 CVS date: 2001/07/17 15:11:14 --HG-- extra : convert_revision : 25158b761541ab89b57e54ec78e6146b252ca20b Put in Matt's changes for findif, and committed my changes for the new module loader. You now have to have glib. CVS patchset: 671 CVS date: 2001/07/17 15:00:04 --HG-- extra : convert_revision : 1005cbfe872cc7aa691a8f721f3cd98d34965692 2001-07-09 alan Put in some comments on how we configure things to look for glib headers etc. (also to test commit log emails). CVS patchset: 667 CVS date: 2001/07/09 03:12:00 --HG-- extra : convert_revision : b5e36c8d7012625b664a25ea3759bf6c5e4ff2d0 2001-07-06 alan Minor changes to support the new directory structure and plugin loading subsystem. CVS patchset: 655 CVS date: 2001/07/06 20:39:46 --HG-- extra : convert_revision : 4b21fb97bbc4a89c2dd2fde8fb186fa6dc0a2b39 Renamed the new module loading code directory... CVS patchset: 645 CVS date: 2001/07/06 01:55:39 --HG-- extra : convert_revision : 82116a2448961a800bf2130bfa3030792a92ee40 2001-06-29 alan Updated new module loading code. Strangely enough it looks like core pieces of it are working. CVS patchset: 627 CVS date: 2001/06/29 08:05:51 --HG-- extra : convert_revision : bca50da3e5dec3ac8442019067ea1e8ebc4fb9a4 2001-06-28 alan Patch from Juri to install our scripts with paths patched appropriately. CVS patchset: 623 CVS date: 2001/06/28 20:35:00 --HG-- extra : convert_revision : 525834078860ba807fab2d17940ca333ff3df849 2001-06-25 alan Minor changes to configure messages. 
CVS patchset: 603 CVS date: 2001/06/25 12:09:20 --HG-- extra : convert_revision : b01abff8928f4c3756ed8b8b8ac3e40d06e9a7a9 2001-06-23 alan Moved the new module loader code to the lib/upmls directory. Modified Makefile.am, configure.in, etc. to match... New code is now only optionally compiled... CVS patchset: 593 CVS date: 2001/06/23 15:52:06 --HG-- extra : convert_revision : afa8f4e1be7d97150dd81412c1977d59548b04c6 Changed CLOCKS_PER_SEC back into CLK_TCK. Quite a few places, and add portability stuff for it to portability.h CVS patchset: 591 CVS date: 2001/06/23 07:01:48 --HG-- extra : convert_revision : 0fbe6454c4c660ee7072890b54a52d961d72e571 Changed the code to use inet_pton() when it's available, and emulate it when it's not... Patch was from Chris Wright. CVS patchset: 590 CVS date: 2001/06/23 04:30:26 --HG-- extra : convert_revision : 999ad51a616ab95f10bc080fa797251ffbcf5a30 2001-06-19 alan Fixed something I thought I fixed last time... CVS patchset: 588 CVS date: 2001/06/19 19:03:31 --HG-- extra : convert_revision : 034d4b84f45b118085d6abfd83da30c1719e4b8a *** empty log message *** CVS patchset: 587 CVS date: 2001/06/19 15:25:28 --HG-- extra : convert_revision : 458ef0fb1cc7eb84861a71b8b81044606cac5f80 Put in code to support newer versions of RH glib install weirdnesses... CVS patchset: 583 CVS date: 2001/06/19 04:05:39 --HG-- extra : convert_revision : cbf48882438490fd6e869e75b295a59b68555bfb 2001-06-16 alan Made a couple of minor changes for FreeBSD (Matt Soffen) CVS patchset: 582 CVS date: 2001/06/16 12:54:53 --HG-- extra : convert_revision : 976794ddcbffdd59ba6679c053f233e320ad421a 2001-06-14 alan Put in a change which should make things find -lintl on *BSD systems. For Matt Soffen... CVS patchset: 579 CVS date: 2001/06/14 21:50:26 --HG-- extra : convert_revision : f18a8f3fb6c650a28f2e6796f4b82ec8d64cea76 2001-06-13 alan Added change to find certain include files in /usr/local for FreeBSD... 
CVS patchset: 578 CVS date: 2001/06/13 21:20:36 --HG-- extra : convert_revision : 1f648e73a6db81c34eb24cf94a8c88259e88329b 2001-06-07 alan Put in various portability changes to compile on Solaris w/o warnings. The symptoms came courtesy of David Lee. CVS patchset: 572 CVS date: 2001/06/07 21:29:44 --HG-- extra : convert_revision : 60254b845fbef540eef604a4e94c9d13f7ef8924 Made stonith.c check for libintl.h before including it... CVS patchset: 568 CVS date: 2001/06/06 22:21:11 --HG-- extra : convert_revision : 2d2d4666e739958057f719aa35dad315e935dcd1 2001-06-06 alan Put in a small change to help configure find dlopen, et al on FreeBSD CVS patchset: 565 CVS date: 2001/06/06 17:49:17 --HG-- extra : convert_revision : a5dd5b1b58faaa146cce0931a19938f71fef3be2 2001-06-05 alan Put in code to include libglib iff present. Also, turn on experimental code in module.c iff glib is available. CVS patchset: 564 CVS date: 2001/06/05 16:43:44 --HG-- extra : convert_revision : 8924169f0b24aca48ab293d4ee4a1273d3c5f6ae 2001-05-26 mmoerz * .cvsignore: added automake generated files that were formerly located in config/ * Makefile.am: removed ac_aux_dir stuff (for libtool) and added libltdl * configure.in: removed ac_aux_dir stuff (for libtool) and added libltdl as a convenience library * bootstrap: added libtools libltdl support * heartbeat/Makefile.am: added some header files to noinst_HEADERS * heartbeat/heartbeat.c: changed dlopen, dlclose to lt_dlopen, lt_dlclose * heartbeat/crc.c: changed to libltdl, exports functions via EXPORT() * heartbeat/mcast.c: changed to libltdl, exports functions via EXPORT() * heartbeat/md5.c: changed to libltdl, exports functions via EXPORT() * heartbeat/ping.c: changed to libltdl, exports functions via EXPORT() * heartbeat/serial.c: changed to libltdl, exports functions via EXPORT() * heartbeat/sha1.c: changed to libltdl, exports functions via EXPORT() * heartbeat/udp.c: changed to libltdl, exports functions via EXPORT() * heartbeat/hb_module.h: added 
EXPORT() Macro, changed to libtools function pointer * heartbeat/module.c: converted to libtool (dlopen/dlclose -> lt_dlopen/...) exchanged scandir with opendir, readdir. enhanced autoloading code so that only .la modules get loaded. CVS patchset: 549 CVS date: 2001/05/26 17:38:01 --HG-- extra : convert_revision : 6a0495566dc22e43352cc8f0dc380a7fa7722abf 2001-05-22 alan Added code from David Lee for porting the file locking semantics in the APC UPS code to other OSes. CVS patchset: 546 CVS date: 2001/05/22 13:00:30 --HG-- extra : convert_revision : 8901c1659826c86d02313b20455e9e8b31d5b320 2001-05-21 alan Fixed the spelling on the -Wtraditional flag in configure.in CVS patchset: 540 CVS date: 2001/05/21 15:00:28 --HG-- extra : convert_revision : ae29290e11aa5798997e76bab976b74d3b2de846 2001-05-18 alan Fixed a bug where we never found any functions on Linux because of the presence of the -Werror flag combined with all our other warnings. Also made the configure script tell you when it was defaulting the -Werror flag on for the current platform. CVS patchset: 535 CVS date: 2001/05/18 15:02:07 --HG-- extra : convert_revision : 436166b585cd3517f2ef419c5c4b626ee0371d90 Changed the heartbeat assert macro to use the autoconf HAVE_STRINGIZE definition. CVS patchset: 532 CVS date: 2001/05/17 23:30:06 --HG-- extra : convert_revision : 37d9ab18f32c775e7773e5b8daa980dda3852571 Changed so that -Wtraditional is not the default. CVS patchset: 530 CVS date: 2001/05/17 22:43:04 --HG-- extra : convert_revision : 474c4c0fdf0a9c35d21e27d0a67671d63a4bb3fc 2001-05-17 alan Added a new replacement function for scandir. CVS patchset: 527 CVS date: 2001/05/17 15:33:11 --HG-- extra : convert_revision : d9654079c6a8dc329a1187c27d9c9635d943dbf6 More minor portability/autoconf fixes. Added a feature called fatal_warnings to the configure process. 
CVS patchset: 525 CVS date: 2001/05/17 14:39:13 --HG-- extra : convert_revision : 5ec3c9562f6f412ef6537e621826f7d72ae09527 2001-05-15 alan More solaris porting changes. CVS patchset: 515 CVS date: 2001/05/15 19:27:17 --HG-- extra : convert_revision : b63a8d9eab6d4050c8df70f742242ceacd8a8a1d 2001-05-13 mmoerz fixed some missing $ in a check for a dlopen supporting libs CVS patchset: 514 CVS date: 2001/05/13 02:31:23 --HG-- extra : convert_revision : 898a2b5077e3b160dea0083ae0a70d71a63ea127 2001-05-12 alan Put in the latest portability fixes (aka autoconf fixes) CVS patchset: 513 CVS date: 2001/05/12 06:05:23 --HG-- extra : convert_revision : af93634333fd1cf780ee794989d0ab9ebdbc6729 2001-05-11 alan Fixed a typo: ip_fw.h was spelled ip_fwd.h CVS patchset: 512 CVS date: 2001/05/11 16:35:46 --HG-- extra : convert_revision : de059e42fe08d681c72ca20ae8a3ec7729ca0e23 Added a fix from Greg Freemyer to support the Tru64 locale directory. CVS patchset: 511 CVS date: 2001/05/11 15:32:47 --HG-- extra : convert_revision : 4f691d043549d0ed02b8d9dabee4e1960382fd5a Put in David Lee's locale generalization code. CVS patchset: 507 CVS date: 2001/05/11 12:06:59 --HG-- extra : convert_revision : 5169522bcc3d7acdabc3cf118346757055872a86 Deleted Makefiles from CVS and made all the warnings go away. CVS patchset: 504 CVS date: 2001/05/10 22:36:37 --HG-- extra : convert_revision : 7d4a8ffcf6ee30411c9d9cd6e8a97d73d6878752 2001-05-10 mmoerz added a better ldl/lintl detection, added an ugly locales detection CVS patchset: 501 CVS date: 2001/05/10 21:15:54 --HG-- extra : convert_revision : 9c7d94741b5c15176c79c0e2826554a2105988bf autoconf & automake & libtool changes * following directories have been added: - config will contain autoconf/automake scripts - linux-ha contains config.h which is generated by autoconf will perhaps some day contain headers which are used throughout linux-ha - replace contains as the name implies replacement stuff for targets where specific sources are missing. 
* following directories have been added to make a split up between c-code and shell scripts and to ease their installation with automake&autoconf - heartbeat/init.d containment of init.d script for heartbeat - heartbeat/logrotate.d containment of logrotate script for heartbeat - ldirectord/init.d similar to heartbeat - ldirectord/logrotate.d similar to heartbeat * general changes touching the complete repository: - all Makefiles have been replaced by Makefile.ams. - all .cvsignore files have been enhanced to cope with the dirs/files that are added by automake/autoconf. Perhaps it would be a nice idea to include those files, but the sum of their size is beyond 100KB and they are likely to vary between automake/autoconf versions. Let's keep in mind that we will have to include them in distribution .tgz anyway. - in dir replace, setenv.c was placed to be available on platforms where putenv() has to be used since setenv is deprecated (better rewrite code -> to be done later) * following changes have been made to the files of linux-ha: - all .cvsignore files have been changed to ignore files generated by autoconf/automake and all files produced during the build-process - heartbeat/heartbeat.c: added #include - heartbeat/config.c: added #include * following files have been added: - Makefile.am: see above - configure.in: main autoconf/automake file - acconfig.h: here are additional defines that are needed for linux-ha/config.h - bootstrap: the shell script that 'compiles' the autoconf/automake script into a usable form - config/.cvsignore: no comment - doc/Makefile.am: no comment - heartbeat/Makefile.am: no comment - heartbeat/lib/Makefile.am: no comment - heartbeat/init.d/.cvsignore: no comment - heartbeat/init.d/heartbeat: copy of heartbeat/heartbeat.sh - heartbeat/init.d/Makefile.am: no comment - heartbeat/logrotate.d/.cvsignore: no comment - heartbeat/logrotate.d/Makefile.am: no comment - heartbeat/logrotate.d/heartbeat: copy of heartbeat/heartbeat.logrotate - 
heartbeat/rc.d/Makefile.am: no comment - heartbeat/resource.d/Makefile.am: no comment - ldirectord/Makefile.am: no comment - ldirectord/init.d/Makefile.am: no comment - ldirectord/init.d/.cvsignore: no comment - ldirectord/init.d/ldiretord: copy of ldirectord/ldirectord.sh - ldirectord/logrotate.d/Makefile.am: no comment - ldirectord/logrotate.d/.cvsignore: no comment - ldirectord//ldiretord: copy of ldirectord/ldirectord.logrotate - linux-ha/.cvsignore: no comment - replace/.cvsignore: no comment - replace/setenv.c: replacement function for targets where setenv is missing - replace/Makefile.am: no comment - stonith/Makefile.am: no comment CVS patchset: 495 CVS date: 2001/05/09 23:21:20 --HG-- extra : convert_revision : 02c31407342c78c95648aae8ff55c1de0fad4b1f